Abstract:

We present a determination of a set of polarized parton distributions (PDFs) of the nucleon,
at next-to-leading order, from a global set of longitudinally polarized deep-inelastic scattering data:
NNPDFpol1.0. The determination is based on the NNPDF methodology: a Monte Carlo approach, with
neural networks used as unbiased interpolants, previously applied to the determination of unpolarized
parton distributions, and designed to provide a faithful and statistically sound representation of PDF
uncertainties. We present our dataset, its statistical features, and its Monte Carlo representation. We
summarize the technique used to solve the polarized evolution equations and its benchmarking, and the
method used to compute physical observables. We review the NNPDF methodology for parametrization
and fitting of neural networks, the algorithm used to determine the optimal fit, and its adaptation to
the polarized case. We finally present our set of polarized parton distributions. We discuss its statistical
properties, test its stability upon various modifications of the fitting procedure, compare it to
other recent polarized parton sets, and in particular obtain predictions for polarized first moments of
PDFs based on it. We find that the uncertainties on the gluon, and to a lesser extent the strange PDF,
were substantially underestimated in previous determinations.
1 Introduction



The interest in the determination of polarized parton distributions (PDFs) of the nucleon is largely 
related to the experimental discovery in the late 80s that the singlet axial charge of the proton is 
anomalously small [1, 2], soon followed by the theoretical realization [3, 4] that the perturbative behavior
of polarized PDFs deviates from parton model expectations, according to which gluons decouple in the 
asymptotic limit. The theoretical interpretation of these results has spawned a huge literature, while at 
the same time experimental information on polarized PDFs from deep-inelastic scattering but also from 
a variety of other processes has been accumulating over the years (see e.g. [5] and references therein). 

First studies of the polarized structure of the nucleon were aimed at an accurate determination 
of polarized first moments (including detailed uncertainty estimates) [6-8], but did not attempt a
determination of a full PDF set, which was first proposed in Ref. [9], but without uncertainty estimation.
More recently, polarized PDF sets with uncertainties have been constructed by at least four groups 
(BB [10, 11], AAC [12], LSS [13, 14] and DSSV [15]). These PDF sets differ slightly in the choice
of datasets, the form of PDF parametrization, and in several details of the QCD analysis (such as the 
treatment of higher twist corrections), but they are all based on the standard Hessian methodology 
for PDF fitting and uncertainty determination, which has been widely used in the unpolarized case 
(see [16, 17] and references therein). This methodology is known [16] to run into difficulties, especially
when information is scarce, because of the intrinsic bias of the Hessian method based on a fixed parton 
parametrization. This is likely to be particularly the case for polarized PDFs, which rely on data both 
less abundant and less accurate than their unpolarized counterparts. 

In order to overcome these difficulties, the NNPDF collaboration has proposed and developed a new 
methodology for PDF determination [18-29]. The NNPDF technique uses a robust set of statistical
tools, which include Monte Carlo methods for error propagation, neural networks for PDF parametrization,
and genetic algorithms for their training. The NNPDF sets are now routinely used by the Tevatron
and LHC collaborations in their data analysis and for data-theory comparisons. In this work we extend 
the application of the NNPDF methodology to the determination of polarized parton distributions of 
the nucleon. As we will see, some PDF uncertainties will turn out to be underestimated in existing PDF 
determinations: in particular those of the polarized gluon distribution, but also those of the strange 
distribution. 

The outline of this paper is as follows. In Sect. 2 we present the data set used to determine polarized
PDFs, and we review the relationship between measured asymmetries and structure functions. In Sect. 3
we discuss the parametrization of polarized PDFs in terms of neural networks, and the construction of
polarized structure functions. Then in Sect. 4 we discuss the minimization strategy. The results for the
NNPDFpol1.0 polarized partons are presented in Sect. 5, and in Sect. 6 we discuss the phenomenological
implications for the spin content of the proton and the test of the Bjorken sum rule. Finally, in Sect. 7 we
summarize our results and outline future developments. Some details on the benchmarking of polarized
PDF evolution are given in the Appendix.

2 Experimental data 

The bulk of the experimental information on (longitudinal) polarized proton structure comes from 
inclusive polarized deep-inelastic scattering with charged lepton beams. Deep-inelastic scattering with 
longitudinally polarized beams and targets allows a determination of the longitudinal structure function 
$g_1(x,Q^2)$, which in turn admits a factorized expression in terms of polarized PDFs. Neutral-current
deep-inelastic scattering does not allow us to disentangle the contribution of quarks and antiquarks.
Using both proton and neutron (deuteron or $^3$He) targets it is possible to separate the isospin singlet and
triplet quark contributions to structure functions, with the gluon determined from scaling violations. A 
weak control on the separation of the isospin singlet quark contribution into its SU(3) octet and singlet 
component is possible using baryon decays to fix the respective normalization of these contributions, 
with in principle their different scale dependence providing some constraint on their shape. 

Only charged-current deep-inelastic scattering would allow for full flavor separation [30]: this could
be feasible with neutrino beams (such as available at a neutrino factory [31]), or perhaps very high-energy
polarized charged lepton beams (such as available at an electron-ion collider [32]). Therefore,
current constraints on flavor separation are only provided by semi-inclusive deep-inelastic scattering
data or by polarized hadron collider processes, such as polarized Drell-Yan production in fixed target
collisions and polarized W production at the Relativistic Heavy Ion Collider (RHIC). Likewise, direct
constraints on the medium and large-x polarized gluon require hadron and jet production either in 
fixed target experiments or at RHIC, while the small-x gluon can only be probed by going to higher 
energy, such as at a polarized Electron-Ion Collider. 

In this paper we will concentrate on inclusive longitudinally polarized DIS data, and thus we will only 
determine a subset of PDF combinations. This first polarized PDF set based on NNPDF methodology 
will then be available for inclusion of other datasets through the reweighting technique of Refs. [24, 28].

We will first review the experimental observables which we use for the determination of polarized 
structure functions, and the information which various experiments provide on them. Then, we will 
summarize the features of the data we use, and finally the construction and validation of the Monte 
Carlo pseudodata sample from the input experimental data. 



2.1 Experimental observables and longitudinal polarized structure functions 



Standard perturbative factorization provides predictions for polarized structure functions $g_1(x,Q^2)$.
However, experiments measure cross section asymmetries, defined by considering longitudinally polarized
leptons scattering off a hadronic target, polarized either longitudinally or transversely with respect
to the collision axis, from which the longitudinal ($A_\parallel$) and transverse ($A_\perp$) asymmetries are determined
as

$$A_\parallel = \frac{d\sigma^{\to\Rightarrow} - d\sigma^{\to\Leftarrow}}{d\sigma^{\to\Rightarrow} + d\sigma^{\to\Leftarrow}}, \qquad A_\perp = \frac{d\sigma^{\to\Uparrow} - d\sigma^{\to\Downarrow}}{d\sigma^{\to\Uparrow} + d\sigma^{\to\Downarrow}}. \qquad (1)$$

The hadronic tensor for polarized, parity-conserving deep-inelastic scattering can be parametrized
in terms of four structure functions: two of them, $F_1(x,Q^2)$ and $F_2(x,Q^2)$, characterize spin-averaged
deep-inelastic scattering, while $g_1(x,Q^2)$ and $g_2(x,Q^2)$ appear when both the lepton beam and the
nucleon target are in definite polarization states. For the conventional definition of the hadronic tensor
in terms of structure functions, see e.g. [33].

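The two polarized structure functions are related to the measurable asymmetries Eq. (1) by

$$g_1(x,Q^2) = \frac{F_1(x,Q^2)}{(1+\gamma^2)(1+\eta\zeta)} \left[ (1+\gamma\zeta)\frac{A_\parallel}{D} - (\eta-\gamma)\frac{A_\perp}{d} \right], \qquad (2)$$

$$g_2(x,Q^2) = \frac{F_1(x,Q^2)}{(1+\gamma^2)(1+\eta\zeta)} \left[ \left(\frac{\zeta}{\gamma}-1\right)\frac{A_\parallel}{D} + \left(\eta+\frac{1}{\gamma}\right)\frac{A_\perp}{d} \right]. \qquad (3)$$
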
In Eqs. (2-3) the dependence on the nucleon mass $m$ is taken into account through the factor

$$\gamma^2 = \frac{4m^2x^2}{Q^2}, \qquad (4)$$

which also appears in the definitions of the other kinematic factors in Eqs. (2-3):

$$d = \frac{D\sqrt{1-y-\gamma^2y^2/4}}{1-y/2}, \qquad (5)$$

$$D = \frac{1-(1-y)\epsilon}{1+\epsilon R(x,Q^2)}, \qquad (6)$$

$$\eta = \frac{\epsilon\gamma y}{1-\epsilon(1-y)}, \qquad (7)$$

$$\zeta = \frac{\gamma(1-y/2)}{1+\gamma^2y/2}, \qquad (8)$$

$$\epsilon = \frac{4(1-y)-\gamma^2y^2}{2y^2+4(1-y)+\gamma^2y^2}. \qquad (9)$$

Here $y$ is the standard lepton scaling variable, given by

$$y = \frac{p\cdot q}{p\cdot k} = \frac{Q^2}{2xmE}, \qquad (10)$$

in terms of the nucleon, lepton and virtual photon momenta, $p$, $k$ and $q$, or, in the target rest frame,
in terms of the energy $E$ of the incoming lepton beam.
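
As an illustration of how these kinematic factors could be evaluated for a single data point, here is a minimal Python sketch. The function name, the numerical value of the nucleon mass and the example inputs are assumptions made for illustration; they are not taken from the analysis code.

```python
import math

NUCLEON_MASS_GEV = 0.938  # assumed value of the nucleon mass m, in GeV


def kinematic_factors(x, q2, beam_energy, r_ratio, m=NUCLEON_MASS_GEV):
    """Evaluate gamma^2, y, epsilon, D, d, eta and zeta for one DIS point.

    x           : Bjorken x
    q2          : Q^2 in GeV^2
    beam_energy : lepton beam energy E in the target rest frame, in GeV
    r_ratio     : unpolarized structure function ratio R(x, Q^2)
    """
    gamma2 = 4.0 * m**2 * x**2 / q2
    gamma = math.sqrt(gamma2)
    y = q2 / (2.0 * x * m * beam_energy)
    eps = (4.0 * (1.0 - y) - gamma2 * y**2) / (
        2.0 * y**2 + 4.0 * (1.0 - y) + gamma2 * y**2
    )
    big_d = (1.0 - (1.0 - y) * eps) / (1.0 + eps * r_ratio)
    small_d = big_d * math.sqrt(1.0 - y - gamma2 * y**2 / 4.0) / (1.0 - y / 2.0)
    eta = eps * gamma * y / (1.0 - eps * (1.0 - y))
    zeta = gamma * (1.0 - y / 2.0) / (1.0 + gamma2 * y / 2.0)
    return {
        "gamma2": gamma2, "y": y, "epsilon": eps,
        "D": big_d, "d": small_d, "eta": eta, "zeta": zeta,
    }


# Example call with made-up kinematics (x = 0.1, Q^2 = 5 GeV^2, E = 29.1 GeV, R = 0.2).
factors = kinematic_factors(0.1, 5.0, 29.1, 0.2)
```

All factors proportional to $\gamma$ (here $\eta$ and $\zeta$) become small when $m^2 \ll Q^2$, which is the limit in which $D$ and $d$ act as pure depolarization factors.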

The unpolarized structure function $F_1$ and the unpolarized structure function ratio $R$, which enter the
definition Eqs. (2-3) of the asymmetry, are expressed in terms of $F_2$ and $F_L$ through Eqs. (11) and (12).

The longitudinal and transverse asymmetries are sometimes expressed in terms of the virtual photoabsorption
asymmetries $A_1$ and $A_2$ according to

$$A_\parallel = D(A_1 + \eta A_2), \qquad A_\perp = d(A_2 - \zeta A_1), \qquad (13)$$

where

$$A_1(x,Q^2) = \frac{\sigma^T_{1/2} - \sigma^T_{3/2}}{\sigma^T_{1/2} + \sigma^T_{3/2}}, \qquad A_2(x,Q^2) = \frac{2\sigma^{TL}}{\sigma^T_{1/2} + \sigma^T_{3/2}}. \qquad (14)$$

Recall that $\sigma^T_{1/2}$ and $\sigma^T_{3/2}$ are the cross sections for the scattering of virtual transversely polarized photons
(corresponding to longitudinal lepton polarization) with helicity of the photon-nucleon system equal to
1/2 and 3/2 respectively, and $\sigma^{TL}$ denotes the interference term between the transverse and longitudinal
photon-nucleon amplitudes. In the limit $m^2 \ll Q^2$, Eqs. (13) reduce to $D = A_\parallel/A_1$, $d = A_\perp/A_2$, thereby
providing a physical interpretation of $d$ and $D$ as depolarization factors.

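Using Eqs. (13) in Eqs. (2-3) we may express the structure functions in terms of $A_1$ and $A_2$ instead:

$$g_1(x,Q^2) = \frac{F_1(x,Q^2)}{1+\gamma^2} \left[ A_1(x,Q^2) + \gamma A_2(x,Q^2) \right], \qquad (15)$$

$$g_2(x,Q^2) = \frac{F_1(x,Q^2)}{1+\gamma^2} \left[ \frac{A_2(x,Q^2)}{\gamma} - A_1(x,Q^2) \right]. \qquad (16)$$
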
We are interested in the structure function $g_1(x,Q^2)$, whose moments are proportional to nucleon
matrix elements of twist-two longitudinally polarized quark and gluon operators, and therefore can be
expressed in terms of longitudinally polarized quark and gluon distributions. Using Eqs. (2-3) we may
obtain an expression of it in terms of the two asymmetries $A_\parallel$, $A_\perp$, or, using Eqs. (15-16), in terms of
the two asymmetries $A_1$, $A_2$. Clearly, up to corrections of $O(m/Q)$, $g_1$ is fully determined by $A_\parallel$, which
coincides with $A_1$ up to $O(m/Q)$ terms, while $g_2$ is determined by $A_\perp$ or $A_2$. It follows that, even though
in principle a measurement of both asymmetries is necessary for the determination of $g_1$, in practice
most of the information comes from $A_\parallel$ or $A_1$, with the other asymmetry only providing a relatively
small correction unless $Q^2$ is very small.

It may thus be convenient to express $g_1$ in terms of $A_\parallel$ and $g_2$:

$$g_1(x,Q^2) = \frac{F_1(x,Q^2)}{1+\gamma\eta}\,\frac{A_\parallel}{D} + \frac{\gamma(\gamma-\eta)}{\gamma\eta+1}\, g_2(x,Q^2), \qquad (17)$$

or, equivalently, in terms of $A_1$ and $g_2$:

$$g_1(x,Q^2) = A_1(x,Q^2)\,F_1(x,Q^2) + \gamma^2 g_2(x,Q^2). \qquad (18)$$

It is then possible to use Eq. (17) or Eq. (18) to determine $g_1(x,Q^2)$ from a dedicated measurement of
the longitudinal asymmetry, and an independent determination of $g_2(x,Q^2)$.
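
The two routes to $g_1$ described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the numerical inputs are hypothetical, and the relations coded here are those obtained by inserting Eqs. (13) into the expressions of $g_1$ and $g_2$ in terms of $A_1$ and $A_2$.

```python
def g1_from_a_parallel(a_par_over_d, f1, g2, gamma, eta):
    """g1 from the longitudinal asymmetry A_par / D and an external g2."""
    return (f1 * a_par_over_d + gamma * (gamma - eta) * g2) / (1.0 + gamma * eta)


def g1_from_a1(a1, f1, g2, gamma):
    """g1 from the virtual photoabsorption asymmetry A1 and an external g2."""
    return a1 * f1 + gamma**2 * g2


# Consistency check on made-up numbers: start from g1 and g2, build the
# asymmetries, and verify that both routes give back the same g1.
g1_true, g2_true, f1, gamma, eta = 0.30, -0.10, 2.0, 0.08, 0.05
a1 = (g1_true - gamma**2 * g2_true) / f1
a2 = gamma * (g1_true + g2_true) / f1
a_par_over_d = a1 + eta * a2

g1_a = g1_from_a_parallel(a_par_over_d, f1, g2_true, gamma, eta)
g1_b = g1_from_a1(a1, f1, g2_true, gamma)
```

Because the $g_2$ term is multiplied by $\gamma$ or $\gamma^2$ in both cases, the result is only mildly sensitive to the assumption made for $g_2$ unless $Q^2$ is small.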

In practice, experimental information on the transverse asymmetry and structure function $g_2$ is
scarce [33-36]. However, the Wilson expansion for polarized DIS implies that the structure function $g_2$
can be written as the sum of a twist-two and a twist-three contribution [37]:

$$g_2(x,Q^2) = g_2^{\rm t2}(x,Q^2) + g_2^{\rm t3}(x,Q^2). \qquad (19)$$

The twist-two contribution to $g_2$ is simply related to $g_1$. One finds

$$g_2^{\rm t2}(x,Q^2) = -g_1(x,Q^2) + \int_x^1 \frac{dy}{y}\, g_1(y,Q^2), \qquad (20)$$

which in Mellin space becomes

$$g_2^{\rm t2}(N,Q^2) = -\frac{N-1}{N}\, g_1(N,Q^2). \qquad (21)$$

It is important to note that $g_2^{\rm t3}$ is not suppressed by a power of $m/Q$ in comparison to $g_2^{\rm t2}$, because in the
polarized case the availability of the spin vector allows the construction of an extra scalar invariant.
Nevertheless, experimental evidence suggests that $g_2^{\rm t3}$ is compatible with zero at low scale $Q^2 \sim m^2$.
Fits to $g_2^{\rm t3}$ [38, 39], as well as theoretical estimates of it [38, 39], support the conclusion that

$$g_2(x,Q^2) \approx g_2^{\rm t2}(x,Q^2), \qquad (22)$$

which is known as the Wandzura-Wilczek [37] relation.

We will thus determine $g_1$, using Eq. (17) or Eq. (18), from an experimental determination of the
longitudinal asymmetry, and using the approximate Wandzura-Wilczek form Eq. (22) of $g_2$. In order
to test the dependence of results on this approximation, we will also consider the opposite assumption
that $g_2 = 0$ identically.
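
To make the Wandzura-Wilczek form concrete, the following Python sketch evaluates $-g_1(x) + \int_x^1 dy\, g_1(y)/y$ numerically for a given $g_1$. The toy shape $g_1(y) = y$, the grid size and the function name are assumptions chosen only so that the result can be checked by hand; no physical input is implied.

```python
import numpy as np


def g2_wandzura_wilczek(g1, x, n_points=2001):
    """Twist-two (Wandzura-Wilczek) g2 at a single x for a callable g1(y).

    The integral over dy/y from x to 1 is done with the trapezoidal rule
    in the variable t = ln(y), where dy/y = dt.
    """
    t = np.linspace(np.log(x), 0.0, n_points)
    integrand = g1(np.exp(t))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    return -g1(x) + integral


# Toy input: g1(y) = y, for which the result is 1 - 2x analytically.
toy_g1 = lambda y: y
value_at_quarter = g2_wandzura_wilczek(toy_g1, 0.25)
```

For this toy input the first moment of $1 - 2x$ over $0 \le x \le 1$ vanishes, in line with Eq. (21) evaluated at $N = 1$.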
2.2 The dataset: observables, kinematic cuts, uncertainties and correlations

We use deep-inelastic lepton-nucleon scattering (DIS) data coming from all relevant experiments [21 
IMl lHS jHBflH] performed at CERN, SLAC and DESY. The experiments use different nucleon targets 
(protons, neutrons or deuterons). The main features of these data sets are summarized in Tab. 1, where
we show, for each experiment, the number of available data points, the kinematic range covered by the
experiment, and the quantity which is published and which we use for the extraction of $g_1$. This quantity
is not the same for all experiments: the primary observable can be one of the many asymmetries or
structure functions discussed in Sect. 2.1, as we now summarize (individual experiments are labeled as
in Tab. 1).

• EMC, SMC, SMClowx, COMPASS, HERMES97 

All these experiments have performed a measurement of $A_\parallel$. They have then determined $A_1$ from
it using Eq. (13), under the assumption $\eta \approx 0$. Therefore, what these experiments actually publish
is a measurement of $A_\parallel/D$. We determine $g_1$ from $A_\parallel/D$ using Eq. (17). This is possible because $D$ is
completely fixed by Eq. (6) in terms of the unpolarized structure function ratio Eq. (12) and of
the kinematics. We determine the unpolarized structure function ratio using as primary inputs
$F_2$, for which we use the parametrization of Ref. [18, 49], and $F_L$, which we determine from its
expression in terms of parton distributions, using the NNPDF2.1 NNLO parton set [26].

• HERMES 

This experiment has performed a measurement of $A_\parallel$, and it publishes both $A_\parallel$ and $A_1$ (which
is determined using Eq. (13) and a parametrization of $A_2$). We use the published values of $A_\parallel$,
which are closer to the experimentally measured quantity, to determine $g_1$ through Eq. (17).

• E143 

This experiment has taken data with three different beam energies, $E_1 = 29.1$ GeV, $E_2 = 16.2$
GeV, $E_3 = 9.7$ GeV. For the highest energy both $A_\parallel$ and $A_\perp$ are independently measured and $A_1$
is extracted from them using Eq. (13); for the two lowest energies only $A_\parallel$ is measured and $A_1$ is
extracted from it using Eqs. (15-16) while assuming the form Eq. (22) for $g_2$. The values of $A_1$
obtained with the three beam energies are combined into a single determination of $A_1$; radiative
corrections are applied at this combination stage. Because of this, we must use this combined
value of $A_1$, from which we then determine $g_1$ using Eq. (18). In order to determine $y$, Eq. (10),
which depends on the beam energy, we use the mean of the three energies.

• E154 

This experiment measures $A_\parallel$ and $A_\perp$ independently, and then extracts a determination of $A_1$.
We use these values of $A_1$ to determine $g_1$ by means of Eq. (18).

• E155 

This experiment only measures $A_\parallel$, from which $g_1/F_1$ is extracted using Eq. (17) with the Wandzura-Wilczek
form of $g_2$ Eq. (22). In this case, we use these values of $g_1/F_1$ and we extract $g_1$ using Eq. (11)
for $F_1$, together with the parametrization of Ref. [18, 49] for $F_2$ and the expression in terms of
parton distributions and the NNPDF2.1 NNLO parton set [26] for $F_L$, as in the other cases.

We have excluded from our analysis all data points with $Q^2 < Q^2_{\rm cut} = 1$ GeV$^2$, since below such an
energy scale perturbative QCD cannot be considered reliable. A similar choice of cut was made in
Table 1: Experimental data sets included in the present analysis. For each experiment we show the number of
points before and after (in parentheses) applying kinematic cuts, the kinematic range and the measured observable.
Figure 1: Experimental data in the $(x,Q^2)$ plane (after kinematic cuts).



Refs. [6-8, 11, 12, 15]. We further impose a cut on the squared invariant mass of the hadronic final
state $W^2 = Q^2(1-x)/x$ in order to remove points which may be affected by sizable higher-twist
corrections. The cut is chosen based on a study presented in Ref. [50], where higher twist terms were
added to the observables, with a coefficient fitted to the data, and it was shown that the higher twist
contribution becomes compatible with zero if one imposes the cut $W^2 > W^2_{\rm cut} = 6.25$ GeV$^2$. We will
follow this choice, which excludes data points with large Bjorken-$x$ at moderate values of the squared
momentum transfer $Q^2$, roughly corresponding to the bottom-right corner of the $(x,Q^2)$-plane, see
Fig. 1; in particular, it excludes all available JLAB data [51-53]. The number of data points surviving
the kinematic cuts for each data set is given in parentheses in Tab. 1.
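
The two kinematic cuts can be expressed as a simple filter, sketched below in Python. The function name and the treatment of points lying exactly on a cut boundary are assumptions; the invariant mass uses the expression $W^2 = Q^2(1-x)/x$ quoted in the text.

```python
Q2_CUT_GEV2 = 1.0    # cut on Q^2, in GeV^2
W2_CUT_GEV2 = 6.25   # cut on W^2, in GeV^2


def passes_kinematic_cuts(x, q2, q2_cut=Q2_CUT_GEV2, w2_cut=W2_CUT_GEV2):
    """Return True if the point (x, Q^2) survives both kinematic cuts.

    Boundary handling (>= for Q^2, > for W^2) is an assumption of this sketch.
    """
    w2 = q2 * (1.0 - x) / x
    return q2 >= q2_cut and w2 > w2_cut


# Made-up points: small x passes, large x at moderate Q^2 is removed by the
# W^2 cut, and low Q^2 is removed by the Q^2 cut.
points = [(0.01, 2.0), (0.5, 2.0), (0.1, 0.5)]
kept = [p for p in points if passes_kinematic_cuts(*p)]
```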

As can be seen from the scatter plot in Fig. 1, the region of the $(x,Q^2)$-plane where data are available
after kinematic cuts is roughly restricted to $4\cdot 10^{-3} < x < 0.6$ and 1 GeV$^2$ < $Q^2$ < 60 GeV$^2$. In recent
years, the coverage of the low-$x$ region has been improved by a complementary set of SMC data [42]
and by the more recent COMPASS data [45, 46]. In the large-$x$ region, information is provided at rather
high $Q^2$ by the same COMPASS data and at lower energy by the latest HERMES measurements [48].
In comparison to the dataset used in Refs. [6-8], several new datasets are being used, in particular the
SMC [42], HERMES and COMPASS [45, 46] data. The dataset used in this paper is the same as
that of Ref. , and also the same as the DIS data of the fit of Ref. [15] , which however has a wider 
data set which extends beyond inclusive DIS. 

Each experimental collaboration provides uncertainties on the measured quantities listed in the
next-to-last column of Tab. 1. Correlated systematics are only provided by EMC and E143, which
give the values of the systematics due to the uncertainty in the beam and target polarizations, while
all other experiments do not provide any information on the covariance matrix. For each experiment,
we determine the uncorrelated uncertainty on $g_1$ by combining the uncertainty on the experimental
observable with that of the unpolarized structure function using standard error propagation. We
include all available correlated systematics. These are provided by the experimental collaboration
as a percentage correction to $g_1$ (or, alternatively, to the asymmetry $A_1$): we apply the percentage
uncertainty on $g_1$ to the structure function determined by us as discussed in Sect. 2.2 (which, of course,
is very close to the value determined by the experimental collaboration).
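We then construct a covariance matrix

$$\mathrm{cov}_{pq} = \left( \sum_i \sigma^{(c)}_{i,p}\,\sigma^{(c)}_{i,q} + \delta_{pq}\,\sigma^{(u)}_p\,\sigma^{(u)}_q \right) g_{1,p}\, g_{1,q}, \qquad (23)$$

where $p$ and $q$ run over the experimental data points, $g_{1,p} = g_1(x_p,Q_p^2)$ ($g_{1,q} = g_1(x_q,Q_q^2)$), $\sigma^{(c)}_{i,p}$ are
the various sources of correlated uncertainty, and $\sigma^{(u)}_p$ the uncorrelated uncertainties, which are in turn
found as a sum in quadrature of all uncorrelated sources of statistical $\sigma^{(\rm stat)}_{i,p}$ and systematic $\sigma^{(\rm syst)}_{j,p}$
uncertainty on each point:

$$\left(\sigma^{(u)}_p\right)^2 = \sum_i \left(\sigma^{(\rm stat)}_{i,p}\right)^2 + \sum_j \left(\sigma^{(\rm syst)}_{j,p}\right)^2. \qquad (24)$$

The correlation matrix is defined as

$$\rho_{pq} = \frac{\mathrm{cov}_{pq}}{\sigma^{(\rm tot)}_p\,\sigma^{(\rm tot)}_q}, \qquad (25)$$

where the total uncertainty $\sigma^{(\rm tot)}_p$ on the $p$-th data point is

$$\left(\sigma^{(\rm tot)}_p\right)^2 = \left(\sigma^{(u)}_p\right)^2 + \sum_i \left(\sigma^{(c)}_{i,p}\right)^2. \qquad (26)$$
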
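
A covariance matrix of this type can be assembled with a few NumPy operations, as in the sketch below. The array layout, the function names and the input numbers are hypothetical. The sketch assumes that correlated and uncorrelated uncertainties are given as relative (fractional) uncertainties on $g_1$, and it normalizes the correlation matrix with the absolute total uncertainty taken from the diagonal of the covariance matrix.

```python
import numpy as np


def build_covariance(g1, sigma_corr, sigma_uncorr):
    """Covariance matrix from relative uncertainties.

    g1           : array (n_points,), central values of g1
    sigma_corr   : array (n_sources, n_points), relative correlated systematics
    sigma_uncorr : array (n_points,), relative uncorrelated uncertainties
    """
    relative = sigma_corr.T @ sigma_corr + np.diag(sigma_uncorr**2)
    return relative * np.outer(g1, g1)


def correlation_matrix(cov):
    """Correlation matrix, normalized by the total uncertainty on each point."""
    sigma_tot = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma_tot, sigma_tot)


# Two made-up data points sharing one 5% correlated systematic.
g1 = np.array([0.2, 0.1])
sigma_corr = np.array([[0.05, 0.05]])
sigma_uncorr = np.array([0.1, 0.2])

cov = build_covariance(g1, sigma_corr, sigma_uncorr)
rho = correlation_matrix(cov)
```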

We show in Tab. 2 the average experimental uncertainties for each dataset, with uncertainties
separated into statistical and correlated systematics. All values are given as absolute uncertainties and
refer to the structure function $g_1$, which has been reconstructed for each experiment as discussed above.
As in the case of Tab. 1, we provide the values before and after kinematic cuts (if different).

In Tab. 2 we distinguish between experiments, defined as groups of data which cannot be correlated
to each other, and datasets within a given experiment, which could in principle be correlated with
each other, as they correspond to measurements of different observables in the same experiment, or
measurements of the same observable in different years. Even though, in practice, only two experiments
provide such correlated systematics (see Tab. 2), this distinction will be useful in the minimization
strategy, see Sect. 4 below.



2.3 Monte-Carlo generation of the pseudo-data sample 

Error propagation from experimental data to the fit is handled by a Monte Carlo sampling of the probability
distribution defined by data. The statistical sample is obtained by generating $N_{\rm rep}$ pseudodata
replicas, according to a multigaussian distribution centered at the data points and with a covariance
equal to that of the original data. Explicitly, given an experimental data point $g_{1,p}^{(\rm exp)} = g_1(x_p,Q_p^2)$, we
generate $k = 1, \ldots, N_{\rm rep}$ artificial points $g_{1,p}^{({\rm art}),(k)}$ according to

$$g_{1,p}^{({\rm art}),(k)}(x,Q^2) = \left[ 1 + \sum_c r^{(k)}_{(c),p}\,\sigma^{(c)}_p + r^{(k)}_{(s),p}\,\sigma^{(s)}_p \right] g_{1,p}^{(\rm exp)}(x,Q^2), \qquad (27)$$
E143
- (-)   0.067 (0.062)   - (-)   0.067 (0.062)
HER- A ID   0.040 (0.034)   - (-)   0.040 (0.034)

Table 2: Averaged statistical, correlated systematic and total uncertainties before and after (in parentheses)
kinematic cuts for each of the experimental sets included in the present analysis. Uncorrelated systematic
uncertainties are considered as part of the statistical uncertainty and they are added in quadrature. All values
are absolute uncertainties and refer to the structure function $g_1$, which has been reconstructed for each experiment
as discussed in the text. Details on the number of points and the kinematics of each dataset are provided in
Tab. 1.
Table 3: Table of statistical estimators for the mean value computed from the Monte Carlo sample with $N_{\rm rep}$ =
10, 100, 1000 replicas. Estimators refer to individual experiments and are defined in Appendix B of Ref. [18].

where $r^{(k)}_{(c),p}$, $r^{(k)}_{(s),p}$ are univariate gaussianly distributed random numbers, and $\sigma^{(c)}_p$ and $\sigma^{(s)}_p$ are respectively
the relative correlated systematic and statistical uncertainty. Unlike in the unpolarized case,
Eq. (27) receives no contribution from normalization uncertainties, given that all polarized observables
are obtained as cross section asymmetries.
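
The replica generation can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the analysis code: the function name, array layout and input numbers are hypothetical, and the random number attached to each correlated source is taken to be shared by all points, which is what makes that source correlated.

```python
import numpy as np


def generate_replicas(g1_exp, sigma_corr, sigma_stat, n_rep, seed=0):
    """Generate n_rep pseudodata replicas of g1.

    g1_exp     : array (n_points,), experimental central values
    sigma_corr : array (n_sources, n_points), relative correlated systematics
    sigma_stat : array (n_points,), relative statistical uncertainties
    Returns an array of shape (n_rep, n_points).
    """
    rng = np.random.default_rng(seed)
    n_sources, n_points = sigma_corr.shape
    # One gaussian number per replica and correlated source, shared by all points.
    r_corr = rng.standard_normal((n_rep, n_sources))
    # One independent gaussian number per replica and data point.
    r_stat = rng.standard_normal((n_rep, n_points))
    shift = 1.0 + r_corr @ sigma_corr + r_stat * sigma_stat
    return shift * g1_exp


# Made-up inputs: two points, one 5% correlated source, 10% and 20% statistical errors.
g1_exp = np.array([0.2, 0.1])
sigma_corr = np.array([[0.05, 0.05]])
sigma_stat = np.array([0.1, 0.2])

replicas = generate_replicas(g1_exp, sigma_corr, sigma_stat, n_rep=20000)
replica_mean = replicas.mean(axis=0)
replica_std = replicas.std(axis=0)
```

Averages and standard deviations over the replica sample should approach the input central values and total uncertainties as the number of replicas grows.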

The number of Monte Carlo replicas of the data is determined by requiring that the central values,
uncertainties and correlations of the original experimental data can be reproduced to a given accuracy by
taking averages, variances and covariances over the replica sample. A comparison between expectation
values and variances of the Monte Carlo set and the corresponding input experimental values as a
function of the number of replicas is shown in Fig. 2, where we display scatter-plots of the central
values and errors for samples of $N_{\rm rep}$ = 10, 100 and 1000 replicas. A more quantitative comparison can
be performed by defining suitable statistical estimators (see, for example, Appendix B of Ref. [18]).

We show in Tabs. 3-4 the percentage error and the scatter correlation $r$ (which is, crudely speaking,
the correlation between the input value and the value computed from the replica sample) for central
values and errors respectively. We do not compute values for correlations, as these, as seen in Tab. 2,
are only available for a very small number of data points from two experiments. Note that some large
values of the percentage uncertainty are due to the fact that $g_1$ for some experiments can take values
which are very close to zero. It is clear from both the tables and the plots that a Monte Carlo sample of
pseudo-data with $N_{\rm rep}$ = 100 is sufficient to reproduce the mean values and the errors of experimental
data to an accuracy which is better than 5%, while the improvement in going up to $N_{\rm rep}$ = 1000 is
moderate. Therefore, we will henceforth use an $N_{\rm rep}$ = 100 replica sample as a default in the remainder
of this paper.

3 From polarized PDFs to observables 

3.1 Leading-twist factorization of the structure functions 

At leading twist, the polarized structure function $g_1$ for neutral-current virtual photon DIS is given in
terms of the polarized quark and gluon distributions by

$$g_1(x,Q^2) = \frac{\langle e^2 \rangle}{2} \left[ C_{NS} \otimes \Delta q_{NS} + C_S \otimes \Delta\Sigma + 2 n_f C_g \otimes \Delta g \right]. \qquad (28)$$
18.4   7.4
.99005   .99988
.89065   .97318   .99894
HERMES   19.5   6.0   1.6   .91523   .99237   .99942

Table 4: Table of statistical estimators for the errors computed from the Monte Carlo sample with $N_{\rm rep}$ =
10, 100, 1000 replicas. Estimators refer to individual experiments and are defined in Appendix B of Ref. [18].
Figure 2: Scatter-plot of experimental versus artificial Monte Carlo mean central values and absolute uncertainties
of polarized structure functions computed from ensembles made of $N_{\rm rep}$ = 10, 100, 1000 replicas.



Here $n_f$ is the number of active flavors, the average charge is given by $\langle e^2 \rangle = n_f^{-1} \sum_{i=1}^{n_f} e_i^2$ in terms of
the electric charge $e_i$ of the $i$-th quark flavor, $\otimes$ denotes the convolution with respect to $x$, and the
nonsinglet and singlet quark distributions are defined as

$$\Delta q_{NS} = \sum_{i=1}^{n_f} \left( \frac{e_i^2}{\langle e^2 \rangle} - 1 \right) (\Delta q_i + \Delta\bar q_i), \qquad \Delta\Sigma = \sum_{i=1}^{n_f} (\Delta q_i + \Delta\bar q_i), \qquad (29)$$

where $\Delta q_i$ and $\Delta\bar q_i$ are the polarized quark and antiquark distributions of flavor $i$ and $\Delta g$ is the polarized
gluon PDF.

In the parton model, Eq. (28) reduces to

$$g_1(x,Q^2) = \frac{1}{2} \sum_{i=1}^{n_f} e_i^2 \left( \Delta q_i(x,Q^2) + \Delta\bar q_i(x,Q^2) \right), \qquad (30)$$

but in perturbative QCD the parton model expression is not recovered even when $Q^2 \to \infty$, because
at large $Q^2$ the first moment of the gluon distribution $\int_0^1 dx\, \Delta g \sim (\alpha_s(Q^2))^{-1}$, so the gluon does not
decouple from $g_1$ asymptotically. Be that as it may, below charm threshold, with $n_f = 3$, Eq. (28) can
be rewritten as

$$g_1(x,Q^2) = \frac{1}{9}\Delta\Sigma(x,Q^2) + \frac{1}{12}\Delta T_3(x,Q^2) + \frac{1}{36}\Delta T_8(x,Q^2), \qquad (31)$$


in terms of the singlet quark-antiquark distribution $\Delta\Sigma(x,Q^2)$, defined in Eq. (29), the isospin triplet
combination

$$\Delta T_3(x,Q^2) = \Delta u(x,Q^2) + \Delta\bar u(x,Q^2) - \left[ \Delta d(x,Q^2) + \Delta\bar d(x,Q^2) \right], \qquad (32)$$

and the SU(3) octet combination

$$\Delta T_8(x,Q^2) = \Delta u(x,Q^2) + \Delta\bar u(x,Q^2) + \Delta d(x,Q^2) + \Delta\bar d(x,Q^2) - 2\left[ \Delta s(x,Q^2) + \Delta\bar s(x,Q^2) \right]. \qquad (33)$$
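
The coefficients 1/9, 1/12 and 1/36 of the singlet, triplet and octet combinations can be checked against the charge-weighted parton model sum for three flavors with exact rational arithmetic. The Python sketch below does this for arbitrary made-up values of the C-even combinations $\Delta q + \Delta\bar q$; the function names are hypothetical.

```python
from fractions import Fraction as F


def g1_charge_weighted(du, dd, ds):
    """Parton model g1 for n_f = 3: (1/2) * sum_i e_i^2 (Dq_i + Dqbar_i).

    du, dd, ds are the C-even combinations Dq + Dqbar for u, d and s.
    """
    return F(1, 2) * (F(4, 9) * du + F(1, 9) * dd + F(1, 9) * ds)


def g1_singlet_triplet_octet(du, dd, ds):
    """Same quantity written in terms of the singlet, triplet and octet."""
    singlet = du + dd + ds
    triplet = du - dd
    octet = du + dd - 2 * ds
    return F(1, 9) * singlet + F(1, 12) * triplet + F(1, 36) * octet


# Arbitrary made-up values, used only to compare the two expressions.
du, dd, ds = F(4, 5), F(-2, 5), F(-1, 10)
lhs = g1_charge_weighted(du, dd, ds)
rhs = g1_singlet_triplet_octet(du, dd, ds)
```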

It is clear from Eqs. (28-30) that neutral current $g_1$ data only allow for a direct determination of the
four polarized PDF combinations $\Delta g$, $\Delta\Sigma$, $\Delta T_3$ and $\Delta T_8$. In principle, an intrinsic polarized component
could also be present for each heavy flavour. However, we will neglect it here and assume that heavy
quark PDFs are dynamically generated above threshold by (massless) Altarelli-Parisi evolution, in a
zero-mass variable-flavor number (ZM-VFNS) scheme. In such a scheme all heavy quark mass effects are
neglected. While they can be introduced for instance through the FONLL method [54], these effects have
been shown to be relatively small already on the scale of present-day unpolarized PDF uncertainties, 
and thus are most likely negligible in the polarized case where uncertainties are rather larger. 

The proton and neutron PDFs are related to each other by isospin, which we will assume to be
exact, thus yielding

$$\Delta u^p = \Delta d^n, \qquad \Delta d^p = \Delta u^n, \qquad \Delta s^p = \Delta s^n, \qquad (34)$$

and likewise for the polarized anti-quarks. In the following we will always assume that PDFs refer to
the proton. The first moments of all non-singlet combinations of quark and antiquark distributions are
scale-independent because of axial current conservation, while the first moment of the singlet quark
distribution is not: because of the axial anomaly, it is scale-dependent in the
$\overline{\rm MS}$ scheme. However, it may be convenient to choose a factorization scheme
in which the first moment of the singlet quark distribution is also scale independent, so that all the
individual quark and antiquark spin fractions are scale independent. Several such schemes, including the
so-called Adler-Bardeen (AB) scheme, were discussed in Ref. [6], where the transformation connecting
them to the $\overline{\rm MS}$ scheme was constructed explicitly.

By means of the SU(2) or SU(3) flavour symmetry it is possible to relate the first moments of the
nonsinglet C-even combinations ($\Delta T_3$ and $\Delta T_8$) to the baryon octet decay constants $a_3$ and $a_8$:

$$a_3 = \int_0^1 dx\, \Delta T_3(x,Q^2), \qquad (35)$$

$$a_8 = \int_0^1 dx\, \Delta T_8(x,Q^2), \qquad (36)$$

whose current experimental values are [55]

$$a_3 = g_A = 1.2701 \pm 0.0025, \qquad (37)$$
$$a_8 = 0.585 \pm 0.025. \qquad (38)$$
A much larger uncertainty on the octet axial charge, up to about 30%, is found if SU(3) symmetry
is violated [56]. Even though a detailed phenomenological analysis does not seem to support this
conclusion [57], we will take as default this more conservative uncertainty estimation

$$a_8 = 0.585 \pm 0.176. \qquad (39)$$

The impact of replacing this with the more aggressive determination given in Eq. (38) will be studied
in Sect. 5.2.

Structure functions will be computed in terms of polarized parton distributions using the so-called 
NNPDF FastKernel method, introduced in Ref. [23]. In short, in this method the PDFs at scale $Q^2$ are
obtained by convolving the parton distributions at the parametrization scale $Q_0^2$ with a set of Green's
functions, which are in turn obtained by solving the QCD evolution equations in Mellin space. These
Green's functions are then convolved with coefficient functions, so that the structure function can be
directly expressed in terms of the PDFs at the parametrization scale through suitable kernels $K$. In
terms of the polarized PDFs at the input scale we have

$$g_1^p = \left\{ K_{g_1,\Delta\Sigma} \otimes \Delta\Sigma_0 + K_{g_1,\Delta g} \otimes \Delta g_0 + K_{g_1,+} \otimes \left( \Delta T_{3,0} + \frac{1}{3}\Delta T_{8,0} \right) \right\}, \qquad (40)$$

where the kernels $K_{g_1,\Delta\Sigma}$, $K_{g_1,\Delta g}$, $K_{g_1,+}$ take into account both the coefficient functions and evolution.
This way of expressing structure functions is amenable to numerical optimization, because all
kernels can then be precomputed and stored, and convolutions may be reduced to matrix multiplications 
by projecting onto a set of suitable basis functions. 
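
The idea of reducing a convolution to a matrix multiplication can be illustrated with a toy Python sketch. This is not the FastKernel implementation: it uses a plain trapezoidal rule on a fixed $x$ grid instead of projecting onto interpolating basis functions, and the kernel, grid and function name are all hypothetical. It only shows that, once the matrix is precomputed, each evaluation of $(K \otimes f)(x) = \int_x^1 \frac{dy}{y} K(x/y) f(y)$ costs a single matrix-vector product.

```python
import numpy as np


def convolution_matrix(kernel, grid):
    """Matrix M such that (M @ f)[i] approximates (K (x) f)(grid[i]).

    kernel : callable K(z), vectorized over arrays of z = x / y
    grid   : increasing array of x values ending at 1
    """
    n = len(grid)
    matrix = np.zeros((n, n))
    for i in range(n - 1):
        y = grid[i:]
        dy = np.diff(y)
        # Trapezoidal weights for the integral from grid[i] to 1.
        weights = np.zeros(len(y))
        weights[:-1] += 0.5 * dy
        weights[1:] += 0.5 * dy
        matrix[i, i:] = weights * kernel(grid[i] / y) / y
    return matrix  # last row stays zero: the integral from 1 to 1 vanishes


# Toy check: K(z) = 1 and f(y) = y give (K (x) f)(x) = 1 - x.
grid = np.linspace(0.1, 1.0, 91)
matrix = convolution_matrix(lambda z: np.ones_like(z), grid)
convolved = matrix @ grid
```

The matrix depends only on the kernel and the grid, so it can be stored once and reused for every trial PDF during a fit.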

The neutron polarized structure function is given in terms of the proton and deuteron ones as

$$g_1^n = \frac{2\, g_1^d}{1 - 1.5\,\omega_D} - g_1^p, \qquad (41)$$

with $\omega_D = 0.05$ the probability that the deuteron is found in a D state. Under the assumption of
exact isospin symmetry, the expression of $g_1^n$ in terms of parton densities is obtained from Eq. (40) by
interchanging the up and down quark PDFs, which amounts to changing the sign of $\Delta T_3$.
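
A short Python sketch of the neutron extraction follows. It assumes the standard D-state correction, in which the deuteron structure function is the average of the proton and neutron ones multiplied by $(1 - 1.5\,\omega_D)$; the function names and input numbers are hypothetical.

```python
OMEGA_D = 0.05  # probability that the deuteron is found in a D state


def g1_deuteron(g1_proton, g1_neutron, omega_d=OMEGA_D):
    """Deuteron g1 from proton and neutron g1, with the D-state correction."""
    return 0.5 * (1.0 - 1.5 * omega_d) * (g1_proton + g1_neutron)


def g1_neutron_from_deuteron(g1_proton, g1_deut, omega_d=OMEGA_D):
    """Invert the relation above to obtain the neutron g1."""
    return 2.0 * g1_deut / (1.0 - 1.5 * omega_d) - g1_proton


# Round trip on made-up values.
g1_p, g1_n = 0.30, -0.05
g1_d = g1_deuteron(g1_p, g1_n)
g1_n_recovered = g1_neutron_from_deuteron(g1_p, g1_d)
```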

The implementation of the polarized PDF evolution up to NLO has been benchmarked against the
HOPPET evolution code [58] using the settings of the Les Houches PDF evolution benchmark tables [59].
This benchmarking is discussed in more detail in Appendix A. We will assume the values $\alpha_s(M_Z^2) =
0.119$ for the strong coupling constant and $m_c = 1.4$ GeV and $m_b = 4.75$ GeV for the charm and bottom
quark masses respectively.



3.2 Target mass corrections to $g_1$

The leading twist expressions of structure functions given in Sect. 3.1 are corrected both by dynamical
and kinematic higher-twist terms. The former are related to the contribution of higher twist operators
to the Wilson expansion, and are generally expected to be small. The latter are related to target-mass
corrections (TMCs), and because of their kinematical origin they can be included exactly: we do this
following Ref. [60]. As discussed in Sect. 2.1, we thus consistently include all nucleon mass effects, both
in the relation between measured asymmetries and structure functions, and in the relation between the 
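latter and parton distributions.

The target mass corrections are especially simple in Mellin space, where they take the form

\begin{align}
\tilde g_1(N,Q^2) &= g_1(N,Q^2) + \frac{m^2}{Q^2} \frac{N(N+1)}{(N+2)^2} \left[ (N+4)\, g_1(N+2,Q^2) + 4 \frac{N+2}{N+1}\, g_2(N+2,Q^2) \right] + \mathcal{O}\left( \frac{m^4}{Q^4} \right) , \tag{42} \\
\tilde g_2(N,Q^2) &= g_2(N,Q^2) + \frac{m^2}{Q^2} \frac{N(N-1)}{(N+2)^2} \left[ N \frac{N+2}{N+1}\, g_2(N+2,Q^2) - g_1(N+2,Q^2) \right] + \mathcal{O}\left( \frac{m^4}{Q^4} \right) . \tag{43}
\end{align}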

We denote by $\tilde g_{1,2}(N,Q^2)$ the Mellin space structure functions with TMCs included, while $g_{1,2}(N,Q^2)$
are the structure functions determined in the $m = 0$ limit.

As discussed in Sect. 2.1, in the absence of precise data on the structure function $g_2$, we will either
determine it using the Wandzura-Wilczek approximation Eq. (22) (which is uncorrected by target-mass
effects [60]), or, as a cross-check, simply set it to zero. In either case, we may then determine $\tilde g_1$,
Eq. (42), in terms of $g_1$.

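In the former (Wandzura-Wilczek) case, substituting Eq. (21) in Eq. (42) and taking the inverse
Mellin transform, we get

\begin{equation}
\tilde g_1(x,Q^2) = \frac{1}{2\pi i} \int dN\, x^{-N} \left[ 1 + \frac{m^2 x^2}{Q^2} \frac{(N-2)^2 (N-1)}{N^2} \right] g_1(N,Q^2) , \tag{44}
\end{equation}

where we have shifted $N \to N - 2$ in the term proportional to $m^2$. Inverting the Mellin transform we
then obtain

\begin{equation}
\tilde g_1(x,Q^2) = g_1(x,Q^2) + \frac{m^2 x^2}{Q^2} \left[ -5 g_1(x,Q^2) - x \frac{d g_1(x,Q^2)}{dx} + \int_x^1 \frac{dy}{y} \left( 8 g_1(y,Q^2) + 4 g_1(y,Q^2) \log\frac{x}{y} \right) \right] . \tag{45}
\end{equation}

If instead $g_2 = 0$,

\begin{equation}
\tilde g_1(x,Q^2) = \frac{1}{2\pi i} \int dN\, x^{-N} \left[ 1 + \frac{m^2 x^2}{Q^2} \frac{(N^2-4)(N-1)}{N^2} \right] g_1(N,Q^2) , \tag{46}
\end{equation}

whence

\begin{equation}
\tilde g_1(x,Q^2) = g_1(x,Q^2) + \frac{m^2 x^2}{Q^2} \left[ - g_1(x,Q^2) - x \frac{d g_1(x,Q^2)}{dx} - \int_x^1 \frac{dy}{y} \left( 4 g_1(y,Q^2) + 4 g_1(y,Q^2) \log\frac{x}{y} \right) \right] . \tag{47}
\end{equation}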

The numerical implementation of Eq. (45) or Eq. (47) is difficult, because of the presence of the
first derivative of $g_1$ in the correction term. Therefore, we will include target mass effects in an iterative
way: we start by performing a fit in which we set $m = 0$, and at each iteration the target-mass corrected
$\tilde g_1$ structure function is computed by means of Eqs. (45)-(47) using the $g_1$ obtained in the previous
minimization step.
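
To make the structure of the correction concrete, the following sketch evaluates the Wandzura-Wilczek form of the target-mass correction in $x$ space, i.e. $\tilde g_1 = g_1 + (m^2 x^2/Q^2)\,[-5 g_1 - x\, g_1' + \int_x^1 (dy/y)\,(8 + 4\log(x/y))\, g_1(y)]$, for a given input $g_1$. The function name `tmc_ww`, the finite-difference step and the trapezoidal integration in $\ln y$ are illustrative assumptions, not the numerical method used in the fit; in the iterative procedure described above, the callable `g1` would be the structure function obtained in the previous minimization step.

```python
import math


def tmc_ww(g1, x, m2_over_q2, n_steps=2000, h=1e-6):
    """Target-mass corrected g1(x) in the Wandzura-Wilczek case (sketch).

    g1:          callable returning the uncorrected g1 at a given x
    m2_over_q2:  the ratio m^2 / Q^2
    """
    # First derivative of g1 by a central finite difference.
    dg1 = (g1(x + h) - g1(x - h)) / (2.0 * h)

    # Trapezoidal rule in t = ln(y), so that dy / y = dt.
    t_lo, t_hi = math.log(x), 0.0
    dt = (t_hi - t_lo) / n_steps

    def integrand(t):
        y = math.exp(t)
        return g1(y) * (8.0 + 4.0 * (t_lo - t))  # log(x/y) = ln x - ln y

    integral = 0.5 * (integrand(t_lo) + integrand(t_hi))
    for i in range(1, n_steps):
        integral += integrand(t_lo + i * dt)
    integral *= dt

    bracket = -5.0 * g1(x) - x * dg1 + integral
    return g1(x) + m2_over_q2 * x * x * bracket
```

For the toy input $g_1(x) = 1 - x$ the bracket can be integrated in closed form, giving $-17 + 18x - 12\ln x - 2\ln^2 x$, which provides a simple check of the numerical routine.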



4 Neural networks and fitting strategy 

We will now briefly review the NNPDF methodology for parton parametrization in terms of neural 
networks, and their optimization (fitting) through a genetic algorithm. The details of the procedure 
have been discussed in previous NNPDF papers, in particular Refs. [20, 23, 61]. Here we summarize the
main steps of the whole strategy, and discuss in greater detail some points which are specific to the 
polarized case.

4.1 Neural network parametrization

Each of the independent polarized PDFs in the evolution basis introduced in Sect. 3.1, $\Delta\Sigma$, $\Delta g$, $\Delta T_3$
and $\Delta T_8$, is parametrized using a multi-layer feed-forward neural network [27]. All neural networks
have the same architecture, namely 2-5-3-1, which corresponds to 37 free parameters for each PDF,
and thus a total of 148 free parameters. This is to be compared to about 10-15 free parameters for
all other available determinations of polarized PDFs. This parametrization has been explicitly shown
to be redundant in the unpolarized case, in that results are unchanged when a smaller neural network
architecture is adopted: this ensures that results do not depend on the architecture [27]. Given that
polarized data are much less abundant and affected by much larger uncertainties than unpolarized ones,
this architecture is adequate also in the polarized case.
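
The parameter counting quoted above can be verified with a short sketch: a 2-5-3-1 network has $2\cdot 5 + 5$ weights and thresholds in the first layer, $5\cdot 3 + 3$ in the second and $3\cdot 1 + 1$ in the last, i.e. 37 in total. The code below also shows a forward pass; the sigmoid activation in the hidden layers, the linear output, the Gaussian initialization and the choice of the two inputs are assumptions made for illustration only.

```python
import math
import random

ARCH = (2, 5, 3, 1)  # architecture quoted in the text


def count_parameters(arch):
    """Number of weights plus thresholds of a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(arch, arch[1:]))


def init_network(arch, rng):
    """Random (weights, biases) pair for each layer (assumed Gaussian init)."""
    return [
        (
            [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
            [rng.gauss(0.0, 1.0) for _ in range(n_out)],
        )
        for n_in, n_out in zip(arch, arch[1:])
    ]


def forward(network, inputs):
    """Feed-forward evaluation: sigmoid hidden layers, linear output layer."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(network):
        z = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_last = index == len(network) - 1
        activations = z if is_last else [1.0 / (1.0 + math.exp(-v)) for v in z]
    return activations[0]
```

With four independent PDFs, `4 * count_parameters(ARCH)` reproduces the total of 148 free parameters.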

The neural network parametrization is supplemented with a preprocessing function. In principle, 
large enough neural networks can reproduce any functional form given sufficient training time. However, 
the training can be made more efficient by adding a preprocessing step, i.e. by multiplying the output 
of the neural networks by a fixed function. The neural network then only fits the deviation from 
this function, which improves the speed of the minimization procedure if the preprocessing function is 
suitably chosen. We thus write the input PDF basis in terms of preprocessing functions and neural 
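networks ${\rm NN}_{\Delta{\rm pdf}}$ as follows

\begin{equation}
\begin{aligned}
\Delta\Sigma(x,Q_0^2) &= (1-x)^{m_1} x^{-n_1}\, {\rm NN}_{\Delta\Sigma}(x) , \\
\Delta T_3(x,Q_0^2) &= A_3 (1-x)^{m_3} x^{-n_3}\, {\rm NN}_{\Delta T_3}(x) , \\
\Delta T_8(x,Q_0^2) &= A_8 (1-x)^{m_8} x^{-n_8}\, {\rm NN}_{\Delta T_8}(x) , \\
\Delta g(x,Q_0^2) &= (1-x)^{m_g} x^{-n_g}\, {\rm NN}_{\Delta g}(x) .
\end{aligned}
\tag{48}
\end{equation}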

Of course, one should check that no bias is introduced in the choice of preprocessing functions. To 
this purpose, we first select a reasonable range of values for the large and small-x preprocessing expo- 
nents m and n, and produce a PDF determination by choosing for each replica a value of the exponents 
at random with uniform distribution within this range. We then determine effective exponents for each 
replica, defined as 

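\begin{align}
m_{\rm eff}(Q^2) &= \lim_{x \to 1} \frac{\ln \Delta f(x,Q^2)}{\ln(1-x)} , \tag{49} \\
n_{\rm eff}(Q^2) &= \lim_{x \to 0} \frac{\ln \Delta f(x,Q^2)}{\ln\frac{1}{x}} , \tag{50}
\end{align}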

where $\Delta f = \Delta\Sigma, \Delta T_3, \Delta T_8, \Delta g$. Finally, we check that the range of variation of the preprocessing
exponents is wider than the range of effective exponents for each PDF. If it is not, we enlarge the range
of variation of preprocessing, then repeat the PDF determination, and iterate until the condition is
satisfied. This ensures that the range of effective large- and small-x exponents found in the fit is not
biased, and in particular not restricted, by the range of preprocessing exponents. Our final values for the
preprocessing exponents are summarized in Tab. 5, while the effective exponents obtained in our fit will
be discussed in Sect. 5.5. It is apparent from Tab. 5 that the allowed range of preprocessing exponents
is rather wider than in the unpolarized case, as a consequence of the limited amount of experimental
information. It is enough to perform this check at the input evolution scale, $Q_0^2 = 1$ GeV$^2$.
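
The check described above can be sketched as follows: the effective exponents are estimated by evaluating $\ln|\Delta f|/\ln(1-x)$ close to $x=1$ and $\ln|\Delta f|/\ln(1/x)$ close to $x=0$, and their spread over replicas is compared with the preprocessing range. The function names, the evaluation points and the use of the absolute value (polarized PDFs may be negative) are assumptions of this sketch; at finite $x$ the estimate is also shifted by the logarithm of any overall normalization.

```python
import math


def effective_exponents(pdf, x_large=1.0 - 1e-6, x_small=1e-8):
    """Finite-x estimates of the large-x (m) and small-x (n) effective exponents."""
    m_eff = math.log(abs(pdf(x_large))) / math.log(1.0 - x_large)
    n_eff = math.log(abs(pdf(x_small))) / math.log(1.0 / x_small)
    return m_eff, n_eff


def range_is_wide_enough(effective_values, lower, upper):
    """True if all effective exponents lie strictly inside [lower, upper]."""
    return lower < min(effective_values) and max(effective_values) < upper
```

If `range_is_wide_enough` returns `False` for some PDF, the preprocessing range would be enlarged and the determination repeated, as described in the text.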

Two of the PDFs in the parametrization basis Eq. (48), namely the nonsinglet triplet and octet
$\Delta T_3$ and $\Delta T_8$, are supplemented by a prefactor. This is because these PDFs must satisfy the sum rules
Eqs. (35, 36), which are enforced by letting

\begin{equation}
A_3 = \frac{a_3}{\int_0^1 dx\, (1-x)^{m_3} x^{-n_3}\, {\rm NN}_{\Delta T_3}(x)} , \qquad
A_8 = \frac{a_8}{\int_0^1 dx\, (1-x)^{m_8} x^{-n_8}\, {\rm NN}_{\Delta T_8}(x)} . \tag{51}
\end{equation}

PDF                        m            n
$\Delta\Sigma(x,Q_0^2)$    [1.5,3.5]    [0.2,0.7]
$\Delta g(x,Q_0^2)$        [2.5,5.0]    [0.4,0.9]
$\Delta T_3(x,Q_0^2)$      [1.5,3.5]    [0.4,0.7]
$\Delta T_8(x,Q_0^2)$      [1.5,3.0]    [0.1,0.6]

Table 5: Ranges for the small and large x preprocessing exponents Eq. (48).


The integrals are computed numerically each time the parameters of the PDF set are modified. The
values of $a_3$ and $a_8$ are chosen for each replica as gaussianly distributed numbers, with central value
and width given by the corresponding experimental values, Eqs. (37, 39).
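
A minimal sketch of how such a normalization could be computed is given below. The integral of the preprocessed network output is evaluated with a midpoint rule after the substitution $x = t^{1/(1-n)}$, which absorbs the integrable $x^{-n}$ singularity at small $x$; the prefactor is then the sampled axial charge divided by the integral. The function names and the integration method are assumptions for illustration, and the constant "network" in the usage comment is a stand-in for a real neural network.

```python
import math
import random


def preprocessed_integral(nn, m, n, n_steps=20000):
    """Midpoint estimate of int_0^1 dx (1-x)^m x^(-n) nn(x), valid for n < 1.

    With x = t**p and p = 1/(1-n) one has x^(-n) dx = p dt, so the integrand
    in t is smooth at the lower endpoint.
    """
    p = 1.0 / (1.0 - n)
    dt = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = ((i + 0.5) * dt) ** p
        total += (1.0 - x) ** m * nn(x)
    return p * total * dt


def normalization(axial_charge, nn, m, n):
    """Prefactor that fixes the first moment to the given axial charge."""
    return axial_charge / preprocessed_integral(nn, m, n)


def sample_axial_charge(central, width, rng):
    """Gaussian fluctuation of the axial charge, drawn once per replica."""
    return rng.gauss(central, width)


# Usage sketch: a8 = sample_axial_charge(0.585, 0.176, random.Random(1))
#               A8 = normalization(a8, lambda x: 1.0, m=2.0, n=0.5)
```

Because the integral depends on the network parameters, it has to be recomputed after every accepted mutation, which is why a cheap quadrature is desirable.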

4.2 Genetic algorithm minimization 



As discussed at length in Ref. [20], minimization with a neural network parametrization of PDFs must
be performed through an algorithm which explores the very wide functional space efficiently. This is
done by means of a genetic algorithm, which is used to minimize a suitably defined figure of merit,
namely the error function [20]

\begin{equation}
E^{(k)} = \frac{1}{N_{\rm dat}} \sum_{I,J=1}^{N_{\rm dat}} \left( g_I^{({\rm art})(k)} - g_I^{({\rm net})(k)} \right) \left( ({\rm cov})^{-1} \right)_{IJ} \left( g_J^{({\rm art})(k)} - g_J^{({\rm net})(k)} \right) . \tag{52}
\end{equation}

Here $g_I^{({\rm art})(k)}$ is the value of the observable $g_I$ at the kinematical point $I$ corresponding to the Monte
Carlo replica $k$, and $g_I^{({\rm net})(k)}$ is the same observable computed from the neural network PDFs; the
covariance matrix $({\rm cov})_{IJ}$ is defined in Eq. (23).
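
In code, the error function is a quadratic form in the differences between the data replica and the neural network prediction, normalized to the number of data points. The sketch below assumes that the inverse covariance matrix has already been computed and is passed as a nested list; the function and argument names are illustrative.

```python
def error_function(g_art, g_net, cov_inv):
    """Figure of merit per data point for one replica (sketch).

    g_art:   pseudo-data values of the Monte Carlo replica
    g_net:   predictions from the neural network PDFs
    cov_inv: inverse covariance matrix as a list of rows
    """
    n_dat = len(g_art)
    diff = [a - b for a, b in zip(g_art, g_net)]
    total = 0.0
    for i in range(n_dat):
        for j in range(n_dat):
            total += diff[i] * cov_inv[i][j] * diff[j]
    return total / n_dat
```

For uncorrelated uncertainties the inverse covariance matrix is diagonal and the expression reduces to the familiar sum of squared pulls divided by the number of points.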

The minimization procedure we adopt follows closely that of Ref. [19], to which we refer for a more
general discussion. Minimization is performed by means of a genetic algorithm, which minimizes the
figure of merit, Eq. (52), by creating, at each minimization step, a pool of new neural nets, obtained by
randomly mutating the parameters of the starting set, and retaining the configuration which corresponds
to the lowest value of the figure of merit.

The parameters which characterize the behaviour of the genetic algorithm are tuned in order to
optimize the efficiency of the minimization procedure: here, we rely on previous experience of the
development of unpolarized NNPDF sets. In particular, the algorithm is characterized by a mutation
rate, which we take to decrease as a function of the number of iterations $N_{\rm ite}$ of the algorithm according
to [20]

\begin{equation}
\eta_{i,j} = \frac{\eta_{i,j}^{(0)}}{N_{\rm ite}^{r_\eta}} , \tag{53}
\end{equation}

so that in the early stages of the training large mutations are allowed, while they become less likely
as one approaches the minimum. The starting mutation rates are chosen to be larger for PDFs which

$\eta^{(0)}_{i,\Delta\Sigma}$    $\eta^{(0)}_{i,\Delta g}$    $\eta^{(0)}_{i,\Delta T_3}$    $\eta^{(0)}_{i,\Delta T_8}$
5, 0.5                           5, 0.5                       2, 0.2                         2, 0.2

Table 6: The initial values of the mutation rates for the two mutations of each PDF.



$N_{\rm gen}^{\rm mut}$    $N_{\rm mut}^{a}$    $N_{\rm mut}^{b}$    $N_{\rm gen}^{\rm wt}$    $E^{\rm sw}$
200                        50                   10                   5000                      2.5

Table 7: Values of the parameters of the genetic algorithm.

contain more information. We perform two mutations per PDF at each step, with the starting rates
given in Tab. 6. The exponent $r_\eta$ has been introduced in order to optimally span the whole range
of possible beneficial mutations and it is randomized between 0 and 1 at each iteration of the genetic
algorithm, as in Ref. [23].

Furthermore, following Ref. [23], we let the number of new candidate solutions depend on the stage
of the minimization. At earlier stages of the minimization, when the number of generations is smaller
than $N_{\rm gen}^{\rm mut}$, we use a large population of mutants, $N_{\rm mut}^{a} \gg 1$, so a larger space of mutations is being
explored. At later stages of the minimization, as the minimum is approached, a smaller number of
mutations $N_{\rm mut}^{b} \ll N_{\rm mut}^{a}$ is used. The values of the parameters $N_{\rm gen}^{\rm mut}$, $N_{\rm mut}^{a}$ and $N_{\rm mut}^{b}$ are collected in
Tab. 7.
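
The ingredients described in this section (a mutation rate that decreases with the iteration number, a randomized exponent, and a population size that depends on the stage of the fit) can be combined in a toy genetic step as sketched below. The function names, the uniform mutation of a single randomly chosen parameter, and the choice of keeping the parent in the pool are assumptions of the sketch; the default population sizes are the values 200, 50 and 10 quoted in the text.

```python
import random


def mutation_rate(eta0, n_ite, r_eta):
    """Mutation rate decreasing as a power of the iteration number."""
    return eta0 / n_ite ** r_eta


def n_mutants(generation, n_gen_mut=200, n_mut_a=50, n_mut_b=10):
    """Large population at early generations, smaller one afterwards."""
    return n_mut_a if generation < n_gen_mut else n_mut_b


def ga_step(params, figure_of_merit, generation, eta0, rng):
    """One generation: create mutants and keep the best configuration."""
    r_eta = rng.random()  # exponent randomized between 0 and 1
    eta = mutation_rate(eta0, generation, r_eta)
    best, best_e = params, figure_of_merit(params)  # parent stays in the pool
    for _ in range(n_mutants(generation)):
        mutant = list(params)
        k = rng.randrange(len(mutant))
        mutant[k] += eta * (rng.random() - 0.5)  # assumed uniform mutation
        e = figure_of_merit(mutant)
        if e < best_e:
            best, best_e = mutant, e
    return best, best_e
```

Keeping the parent in the pool guarantees that the figure of merit never increases from one generation to the next, which is convenient for a sketch but is a design choice rather than a statement about the actual algorithm.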

Because the minimization procedure stops the fit to all experiments at once, we must make sure that 
the quality of the fit to different experiments is approximately the same. This is nontrivial, because of 
the variety of experiments and datasets included in the fit. Therefore, the figure of merit per datapoint 
for a given set is not necessarily a reliable indicator of the quality of the fit to that set, because some 
experiments may have systematically underestimated or overestimated uncertainties. Furthermore, 
unlike for unpolarized PDF fits, information on the experimental covariance matrix is only available 
for a small subset of experiments, so for most experiments statistical and systematic errors must be 
added in quadrature, thereby leading to an overestimate of uncertainties: this leads to a wide spread 
of values of the figure of merit, whose value depends on the size of the correlated uncertainties which 
are being treated as uncorrelated. 

A methodology to deal with this situation was developed in Ref. [23]. The idea is to first determine
the optimal value of the figure of merit for each experiment, i.e. a set of target values $E_i^{\rm targ}$ for each
of the $i$ experiments, then during the fit give more weight to experiments for which the figure of merit
is further away from its target value, and stop training experiments which have already reached the
target value. This is done by minimizing, instead of the figure of merit Eq. (52), the weighted figure of
merit

\begin{equation}
E_{\rm wt}^{(k)} = \frac{1}{N_{\rm dat}} \sum_{j=1}^{N_{\rm sets}} p_j^{(k)} N_{{\rm dat},j} E_j^{(k)} , \tag{54}
\end{equation}

where $E_j^{(k)}$ is the error function for the $j$-th dataset with $N_{{\rm dat},j}$ points, and the weights $p_j^{(k)}$ are given
by

1. If $E_j^{(k)} \geq E_j^{\rm targ}$, then $p_j^{(k)} = \left( E_j^{(k)} / E_j^{\rm targ} \right)^n$,

2. If $E_j^{(k)} < E_j^{\rm targ}$, then $p_j^{(k)} = 0$,

with $n$ a free parameter which essentially determines the amount of weighting. In the unpolarized fits
of Refs. [23, 25, 26, 29] the value $n = 2$ was used. Here instead we will choose $n = 3$. This larger value,
determined by trial and error, is justified by the wider spread of figures of merit in the polarized case,
which in turn is related to the absence of correlated systematics for most experiments.
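
The weighting rule can be summarized in a few lines of code. The sketch below assumes that the weighted figure of merit is the sum over datasets of the weight times the number of points times the error function of each set, divided by the total number of points; the function names are illustrative.

```python
def dataset_weights(errors, targets, n=3):
    """Weight (E_j / E_j^targ)^n above target, zero once the target is reached."""
    return [(e / t) ** n if e >= t else 0.0 for e, t in zip(errors, targets)]


def weighted_error(errors, n_points, targets, n=3):
    """Weighted figure of merit built from per-dataset error functions."""
    weights = dataset_weights(errors, targets, n)
    n_dat = sum(n_points)
    return sum(p * nj * e for p, nj, e in zip(weights, n_points, errors)) / n_dat
```

For example, a dataset with twice its target error function receives a weight of 8 with $n = 3$ but only 4 with $n = 2$, which shows how the larger exponent pushes the minimization harder towards poorly fitted sets.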

The target values $E_i^{\rm targ}$ are determined through an iterative procedure: they are set to one at first,
then a very long fixed-length fit is run, and the values of $E_i$ are taken as targets for a new fit, which
is performed until stopping (according to the criterion to be discussed in Sect. 4.3 below). The values
of $E_i$ at the end of this fit are then taken as new targets until convergence is reached, usually after a
couple of iterations.

Weighted training stops after the first $N_{\rm gen}^{\rm wt}$ generations, unless the total error function Eq. (52)
is above some threshold, $E^{(k)} \geq E^{\rm sw}$. If it is, weighted training continues until $E^{(k)}$ falls below the
threshold value. Afterwards, the error function is just the unweighted error function Eq. (52) computed
on experiments. This ensures that the figure of merit behaves smoothly in the last stages of training.
The values for the parameters $N_{\rm gen}^{\rm wt}$ and $E^{\rm sw}$ are also given in Tab. 7.

4.3 Determination of the optimal fit 

Because the neural network parametrization is very redundant, it may be able to fit not only the
underlying behaviour of the PDFs, but also the statistical noise in the data. Therefore, the best fit
does not necessarily coincide with the absolute minimum of the figure of merit Eq. (52). We thus
determine the best fit, as in Refs. [19, 20], using a cross-validation method [62]: for each replica, the
data are randomly divided in two sets, training and validation, which include a fraction $f_{\rm tr}^{(j)}$ and
$f_{\rm val}^{(j)} = 1 - f_{\rm tr}^{(j)}$ of the data points respectively. The figure of merit Eq. (52) is then computed for
both sets. The training figure of merit function is minimized through the genetic algorithm, while the
validation figure of merit is monitored: when the latter starts increasing while the former still decreases
the fit is stopped. This means that the fit is stopped as soon as the neural network is starting to learn
the statistical fluctuations of the points, which are different in the training and validation sets, rather
than the underlying law which they share.

In the unpolarized fits of Refs. [19, 20, 23, 25, 26, 29] equal training and validation fractions were
uniformly chosen, $f_{\rm tr}^{(j)} = f_{\rm val}^{(j)} = 1/2$. However, in this case we have to face the problem that the number
of datapoints is quite small: most experiments include about ten datapoints (see Tab. 1). Hence, it is
difficult to achieve a stable minimization if only half of them are actually used for minimization, as we
have explicitly verified. Therefore, we have chosen to include 80% of the data in the training set, i.e.
$f_{\rm tr}^{(j)} = 0.8$ and $f_{\rm val}^{(j)} = 0.2$. We have explicitly verified that the fit quality which is obtained in this case
is comparable to the one achieved when including all data in the training set (i.e. with $f_{\rm tr}^{(j)} = 1.0$ and
$f_{\rm val}^{(j)} = 0.0$). The presence of a nonzero validation set allows for a satisfactory stopping, as we have
checked by explicit inspection of the profiles of the figure of merit as a function of training time.
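
The random partition into training and validation points can be sketched as follows. The function name, the rounding rule and the assumption that the split is performed separately for each set of points are illustrative choices, not details taken from the text.

```python
import random


def split_points(n_points, f_tr, rng):
    """Randomly assign data-point indices to training and validation sets."""
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_tr = int(round(f_tr * n_points))  # assumed rounding rule
    return sorted(indices[:n_tr]), sorted(indices[n_tr:])
```

With about ten points per experiment, a training fraction of 0.8 leaves eight points for the minimization and two for monitoring, whereas an even split would leave only five points to constrain the fit.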

In practice, in order to implement cross-validation we must determine a stopping criterion, namely,
give conditions which must be satisfied in order for the minimization to stop. First, we require that the
weighted training stage has been completed, i.e., that the genetic algorithm has been run for at least
$N_{\rm gen}^{\rm wt}$ minimization steps. Furthermore, we check that all experiments have reached a value of the figure
of merit below a minimal threshold $E_{\rm thr}$. Note that because stopping can occur only after weighted
training has been switched off, and this in turn only happens when the figure of merit falls below the
value $E^{\rm sw}$, the total figure of merit must be below this value in order for stopping to be possible.

$N_{\rm gen}^{\rm max}$    $E_{\rm thr}$    $N_{\rm smear}$    $\Delta_{\rm smear}$    $\delta_{\rm tr}$     $\delta_{\rm val}$
20000                      8                100                100                     $5 \cdot 10^{-4}$     $5 \cdot 10^{-4}$

Table 8: Parameters for the stopping criterion.



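We then compute moving averages

\begin{equation}
\langle E_{\rm tr,val}(i) \rangle \equiv \frac{1}{N_{\rm smear}} \sum_{l=i-N_{\rm smear}+1}^{i} E_{\rm wt;\,tr,val}(l) \tag{55}
\end{equation}

of the figure of merit Eq. (54) for either the training or the validation set at the $i$-th genetic minimization
step. The fit is then stopped if

\begin{equation}
r_{\rm tr} < 1 - \delta_{\rm tr} \quad {\rm and} \quad r_{\rm val} > 1 + \delta_{\rm val} , \tag{56}
\end{equation}

where

\begin{align}
r_{\rm tr} &\equiv \frac{\langle E_{\rm tr}(i) \rangle}{\langle E_{\rm tr}(i - \Delta_{\rm smear}) \rangle} , \tag{57} \\
r_{\rm val} &\equiv \frac{\langle E_{\rm val}(i) \rangle}{\langle E_{\rm val}(i - \Delta_{\rm smear}) \rangle} . \tag{58}
\end{align}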

The parameter $N_{\rm smear}$ determines the width of the moving average; the parameter $\Delta_{\rm smear}$ determines
the distance between the two points along the minimization path which are compared in order to
determine whether the figure of merit is increasing or decreasing; and the parameters $\delta_{\rm tr}$, $\delta_{\rm val}$ are the
threshold values for the decrease of the training and increase of the validation figure of merit to be
deemed significant. The optimal value of these parameters should be chosen in such a way that the fit
does not stop on a statistical fluctuation, yet it does stop before the fit starts overlearning (i.e. learning
statistical fluctuations). As explained in Ref. [23], this is done by studying the profiles of the error functions
for individual datasets and for individual replicas. In order to avoid unacceptably long fits, training is
stopped anyway when a maximum number of iterations $N_{\rm gen}^{\rm max}$ is reached, even though the stopping
conditions Eq. (56) are not satisfied. This leads to a small loss of accuracy of the corresponding fits:
this is acceptable provided it only happens for a small enough fraction of replicas. If a fit stops at
$N_{\rm gen}^{\rm max}$ without the stopping criterion having been satisfied, we also check that the total figure of merit
is below the value $E^{\rm sw}$ at which weighted training is switched off. If it is not, we conclude that the
specific fit has not converged, and we retrain the same replica, i.e., we perform a new fit to the same
data starting with a different random seed. This only occurs in about one or two percent of cases.
The full set of parameters which determine the stopping criterion is given in Tab. 8.
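
The core of the stopping test, namely the comparison of moving averages of the training and validation figures of merit at two points separated by a fixed number of generations, can be sketched as below. The function names and the zero-based generation index are assumptions; the default parameter values are the smearing lengths of 100 and thresholds of $5\cdot 10^{-4}$ quoted above. The additional requirements on the completion of weighted training and on the threshold for individual experiments are not included in this sketch.

```python
def moving_average(history, step, n_smear):
    """Average of the last n_smear values of the history ending at 'step'."""
    window = history[step - n_smear + 1 : step + 1]
    return sum(window) / n_smear


def should_stop(e_tr, e_val, step, n_smear=100, delta_smear=100,
                delta_tr=5e-4, delta_val=5e-4):
    """True if training still decreases while validation has started rising."""
    if step - delta_smear - n_smear + 1 < 0:
        return False  # not enough history to form both moving averages
    r_tr = (moving_average(e_tr, step, n_smear)
            / moving_average(e_tr, step - delta_smear, n_smear))
    r_val = (moving_average(e_val, step, n_smear)
             / moving_average(e_val, step - delta_smear, n_smear))
    return r_tr < 1.0 - delta_tr and r_val > 1.0 + delta_val
```

Smearing over many generations is what prevents the fit from stopping on a single statistical fluctuation of the validation figure of merit.
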
An example of how the stopping criterion works in practice is shown in Fig. 3. We display the
moving averages Eq. (55) of the training and validation error functions $\langle E_{\rm tr,val} \rangle$, computed with the
parameter settings of Tab. 8 and plotted as a function of the number of iterations of the genetic algorithm,
for a particular replica and for two of the experiments included in the fit. The wide fluctuations
which are observed in the first part of training, up to the $N_{\rm gen}^{\rm wt}$-th generation, are due to the fact that the
weights which enter the definition of the figure of merit Eq. (54) are frequently adjusted. Nevertheless,
the downwards trend of the figure of merit is clearly visible. Once the weighted training is switched off,
minimization proceeds smoothly. The vertical line denotes the point at which the stopping criterion is
satisfied. Here, we have let the minimization go on beyond this point, and we see clearly that the
minimization has entered an overlearning regime, in which the validation error function $E_{\rm val}^{(k)}$ is rising while
the training $E_{\rm tr}^{(k)}$ is still decreasing. Note that the stopping point, which in this particular case occurs
at $N_{\rm gen}^{\rm stop} = 5794$, is determined by verifying that the stopping criteria are satisfied by the total figure
of merit, not that of individual experiments shown here. The fact that the two different experiments
considered here both start overlearning at the same point shows that the weighted training has been
effective in synchronizing the fit quality for different experiments.

Figure 3: Behaviour of the moving average Eq. (55) of the training and validation figure of merit for two
different datasets included in a global fit (COMPASS-P and HERMES) as a function of training length. The
straight vertical line indicates the point at which the fit stops with the stopping parameters of Tab. 8. The
weighted training is switched off at $N_{\rm gen}^{\rm wt} = 5000$.

4.4 Theoretical constraints 

Polarized PDFs are only loosely constrained by data, which are scarce and not very accurate. Theo- 
retical constraints are thus especially important in reducing the uncertainty on the PDFs. We consider 
in particular positivity and integrability. 

Positivity of the individual cross sections which enter the polarized asymmetries Eq. (1) implies
that, up to power-suppressed corrections, longitudinal polarized structure functions are bounded by
their unpolarized counterparts, i.e.

\begin{equation}
|g_1(x,Q^2)| \leq F_1(x,Q^2) . \tag{59}
\end{equation}

At leading order, structure functions are proportional to parton distributions, so imposing Eq. (59)
for any process (and a similar condition on an asymmetry which is sensitive to polarized gluons [63]),
would imply

\begin{equation}
|\Delta f_i(x,Q^2)| \leq f_i(x,Q^2) \tag{60}
\end{equation}

for any pair of unpolarized and polarized PDFs $f$ and $\Delta f$, for all quark flavors and gluon $i$, for all $x$, and
for all $Q^2$. Beyond leading order, the condition Eq. (59) must still hold, but it does not necessarily imply
Eq. (60). Rather, one should then impose at least a number of conditions of the form of Eq. (59) on
physically measurable cross-sections which is equal to the number of independent polarized PDFs. For
example, in principle one may require that the condition Eq. (59) is separately satisfied for each flavor,
i.e. when only contributions from the $i$-th flavor are included in the polarized and unpolarized structure
function: this corresponds to requiring positivity of semi-inclusive structure functions which could in
principle be measured (and that fragmentation effects cancel in the ratio). A condition on the gluon
can be obtained by imposing positivity of the polarized and unpolarized cross-sections for inclusive
Higgs production in gluon-proton scattering [63], again measurable in principle if not in practice.

Because $g_1/F_1 \sim x$ as $x \to 0$ [64], the positivity bound Eq. (59) is only significant at large enough
$x \gtrsim 10^{-2}$. On the other hand, at very large $x$ the NLO corrections to the LO positivity bound become
negligible [63, 65]. Therefore, the NLO positivity bound in practice only differs from its LO counterpart
Eq. (60) in a small region $10^{-2} \lesssim x \lesssim 0.3$, and even there by an amount of rather less than 10% [63],
which is negligible in comparison to the size of PDF uncertainties, as we shall see explicitly in Sect. 5.

Therefore, we will impose the leading-order positivity bound Eq. (60) on each flavor combination
$\Delta q_i + \Delta \bar q_i$ and on the gluon $\Delta g$ (denoted as $\Delta f_i$ below). We do this by requiring

\begin{equation}
|\Delta f_i(x,Q^2)| \leq f_i(x,Q^2) + \sigma_i(x,Q^2) , \tag{61}
\end{equation}



where $\sigma_i(x,Q^2)$ is the uncertainty on the corresponding unpolarized PDF combination $f_i(x,Q^2)$ at the
kinematic point $(x,Q^2)$. This choice is motivated by two considerations. First, it is clearly meaningless
to impose positivity of the polarized PDF to an accuracy which is greater than that with which the
unpolarized PDF has been determined. Second, because the unpolarized PDFs satisfy NLO positivity,
they can become negative and thus they may have nodes. As a consequence, the LO bound Eq. (60)
would imply that the polarized PDF must vanish at the same point, which would be clearly meaningless.

As in Ref. [23], positivity is imposed during the minimization procedure, thereby guaranteeing that
the genetic algorithm only explores the subspace of acceptable physical solutions. This is done through
a Lagrange multiplier $\lambda_{\rm pos}$, i.e. by computing the polarized PDF at $N_{\rm dat,pos}$ fixed kinematic points
$(x_p, Q_0^2)$ and then adding to the error function Eq. (52) a contribution
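\begin{equation}
E_{\rm pos}^{(k)} = \lambda_{\rm pos} \sum_{p=1}^{N_{\rm dat,pos}} \left\{ \sum_{j=u+\bar u,\, d+\bar d,\, s+\bar s,\, g} \Theta\left[ \left| \Delta f_j^{({\rm net})(k)}(x_p,Q_0^2) \right| - (f_j + \sigma_j)(x_p,Q_0^2) \right] \left[ \left| \Delta f_j^{({\rm net})(k)}(x_p,Q_0^2) \right| - (f_j + \sigma_j)(x_p,Q_0^2) \right] \right\} . \tag{62}
\end{equation}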



This provides a penalty, proportional to the violation of positivity, which enforces Eq. (61) separately
for all the non-zero quark-antiquark combinations. The values of the unpolarized PDF combination
$f_j(x,Q^2)$ and its uncertainty $\sigma_j(x,Q^2)$ are computed using the NNPDF2.1 NNLO PDF set [25], while
$\Delta f_j^{({\rm net})(k)}$ is the corresponding polarized PDF computed from the neural network parametrization for
the $k$-th replica. The polarized and unpolarized PDFs are evaluated at $N_{\rm dat,pos} = 20$ points with $x$
equally spaced in the interval

\begin{equation}
x \in \left[ 10^{-2}, 0.9 \right] . \tag{63}
\end{equation}

Positivity is imposed at the initial scale $Q_0^2 = 1$ GeV$^2$ since once positivity is enforced at low scales, it
is automatically satisfied at larger scales [63, 65]. After stopping, we finally test that the positivity condition
Eq. (61) is satisfied on a grid of $N_{\rm dat,pos} = 40$ points in the same interval. Replicas for which positivity
is violated in one or more points are discarded and retrained.
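
A penalty of this kind can be sketched as follows: at each point of a grid in $x$, and for each flavour combination, the amount by which the absolute value of the polarized PDF exceeds the unpolarized PDF plus its uncertainty is accumulated and multiplied by the Lagrange multiplier. The function names, the dictionary-of-callables interface and the linear spacing of the grid are assumptions of this sketch; the actual unpolarized inputs would come from an existing PDF set rather than from toy functions.

```python
def x_grid(n_points=20, x_min=1e-2, x_max=0.9):
    """Equally spaced grid in x (linear spacing is an assumption)."""
    step = (x_max - x_min) / (n_points - 1)
    return [x_min + i * step for i in range(n_points)]


def positivity_penalty(polarized, unpolarized, sigma, lam, grid):
    """Lagrange-multiplier penalty proportional to the positivity violation.

    polarized, unpolarized, sigma: dicts mapping a flavour label to a
    callable of x (hypothetical interface).
    """
    penalty = 0.0
    for x in grid:
        for flavour, delta_f in polarized.items():
            bound = unpolarized[flavour](x) + sigma[flavour](x)
            violation = abs(delta_f(x)) - bound
            if violation > 0.0:  # only violations contribute
                penalty += violation
    return lam * penalty
```

The same routine evaluated on a finer grid, with any nonzero result treated as a failure, gives a simple version of the test applied to replicas after stopping.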

In the unpolarized case, in which positivity only played a minor role in constraining PDFs, a fixed
value of the Lagrange multiplier $\lambda_{\rm pos}$ was chosen. In the polarized case it turns out to be necessary to
vary the Lagrange multiplier along the minimization. Specifically, we let
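\begin{equation}
\lambda_{\rm pos} = \begin{cases} \lambda_{\rm max}^{(N_{\rm gen}-1)/(N_{\lambda_{\rm max}}-1)} , & N_{\rm gen} < N_{\lambda_{\rm max}} , \\ \lambda_{\rm max} , & N_{\rm gen} \geq N_{\lambda_{\rm max}} . \end{cases} \tag{64}
\end{equation}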



This means that the Lagrange multiplier increases as the minimization proceeds, starting from $\lambda_{\rm pos} = 1$
at the first minimization step, $N_{\rm gen} = 1$, up to $\lambda_{\rm pos} = \lambda_{\rm max} \gg 1$ when $N_{\rm gen} = N_{\lambda_{\rm max}}$. After $N_{\lambda_{\rm max}}$
generations $\lambda_{\rm pos}$ is then kept constant at $\lambda_{\rm max}$. The rationale behind this choice is that the genetic
algorithm can thus learn experimental data and positivity at different stages of minimization. During
the early stages, the contribution coming from the modified error function Eq. (62) is negligible, due to
the moderate value of the Lagrange multiplier; hence, the genetic algorithm will mostly learn the basic
shape of the PDF driven by experimental data. As the minimization proceeds, the contribution
coming from the Lagrange multiplier increases, thus ensuring the proper learning of positivity: at this
stage, most of the replicas which will not fulfill the positivity bound will be discarded.
The final values of $N_{\lambda_{\rm max}} = 2000$ and $\lambda_{\rm max} = 10$ have been determined as follows. First of all, we
have performed a fit without any positivity constraint and we have observed that data were mostly learnt
in about 2000 generations: hence we have taken this value for $N_{\lambda_{\rm max}}$. Then we have tried different values
for $\lambda_{\rm max}$ until we managed to reproduce the same $\chi^2$ obtained in the previous, positivity unconstrained,
fit. This ensures that positivity is not learnt to the detriment of the global fit quality.

Notice that the value of $\lambda_{\rm max}$ is rather small if compared to the analogous Lagrange multiplier used
in the unpolarized case [25]. This depends on the fact that, in this latter case, positivity is learnt at
the early stages of minimization, when the error function can be much larger than its asymptotic value:
a large Lagrange multiplier is then needed to select the best replicas. Also, unpolarized PDFs are
quite well constrained by data and positivity is almost automatically fulfilled, except in some restricted
kinematic regions; only a few replicas violate positivity and need to be penalized. This means that the
behaviour of the error function Eq. (52), which governs the fitting procedure, is essentially dominated
by data instead of positivity.

In the polarized case, instead, positivity starts to be effectively implemented only after some
minimization steps, when the error function has already decreased to a value of a few units. Furthermore, we
have checked that, at this stage, most replicas slightly violate the positivity condition Eq. (61): thus,
too large a value of the Lagrange multiplier on the one hand would penalize replicas which are good in
reproducing experimental data and only slightly worse in reproducing positivity; on the other, it would
promote replicas which fulfill positivity but whose fit to data is quite bad. As a consequence of this
behaviour, the convergence of the minimization algorithm would be harder to reach. We also verified
that using a value for the Lagrange multiplier up to $\lambda_{\rm pos} = 100$ leads to no significant improvement
either in the fulfillment of the positivity requirement or in the fit quality. We will show in detail the
effects of the positivity bound Eq. (61) on the fitted replicas and on polarized PDFs in Sect. 5.

Finally, as already mentioned, we impose an integrability constraint. The requirement that polarized
PDFs be integrable, i.e. that they have finite first moments, corresponds to the assumption that the
nucleon matrix element of the axial current for the $i$-th flavor is finite. The integrability condition is
imposed by computing at each minimization step the integral of each of the polarized PDFs in a given
interval,

\begin{equation}
I(x_1,x_2) = \int_{x_1}^{x_2} dx\, \Delta q_i(x,Q^2) , \tag{65}
\end{equation}

with $x_1$ and $x_2$ chosen in the small $x$ region, well below the data points, and verifying that in this
region the growth of the integral as $x_1$ decreases for fixed $x_2$ is less than logarithmic. In practice, we
test for the condition

\begin{equation}
\frac{I(x_1,x_2)}{I(x_1',x_2)} < \frac{\ln\frac{x_2}{x_1}}{\ln\frac{x_2}{x_1'}} , \tag{66}
\end{equation}

with $x_1 < x_1'$. Mutations which do not satisfy the condition are rejected during the minimization
procedure. In our default fit, we chose $x_1 = 10^{-5}$, $x_1' = 2 \cdot 10^{-5}$ and $x_2 = 10^{-4}$.
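
One way to implement a test of this kind is sketched below, under the assumption that "less than logarithmic growth" is checked by comparing the ratio of the integrals over $[x_1, x_2]$ and $[x_1', x_2]$ with the ratio of the corresponding logarithms, $\ln(x_2/x_1)/\ln(x_2/x_1')$, which is the value the ratio would take for a PDF behaving as $1/x$. The function names, the trapezoidal integration in $\ln x$, the default interval endpoints and the restriction to PDFs of definite sign in the test region are all assumptions of this sketch.

```python
import math


def log_integral(pdf, x_lo, x_hi, n_steps=2000):
    """Trapezoidal estimate of int_{x_lo}^{x_hi} dx pdf(x), using t = ln x."""
    t_lo, t_hi = math.log(x_lo), math.log(x_hi)
    dt = (t_hi - t_lo) / n_steps

    def integrand(t):
        x = math.exp(t)
        return x * pdf(x)  # dx = x dt

    total = 0.5 * (integrand(t_lo) + integrand(t_hi))
    for i in range(1, n_steps):
        total += integrand(t_lo + i * dt)
    return total * dt


def is_integrable(pdf, x1=1e-5, x1_prime=2e-5, x2=1e-4):
    """True if the integral grows less than logarithmically as x1 decreases.

    The default endpoints are placeholder small-x values with x1 < x1_prime.
    """
    ratio = log_integral(pdf, x1, x2) / log_integral(pdf, x1_prime, x2)
    bound = math.log(x2 / x1) / math.log(x2 / x1_prime)
    return ratio < bound
```

A PDF behaving as $x^{-1/2}$ at small $x$ passes the test, while one behaving as $x^{-3/2}$ fails it, in agreement with the integrability of their first moments.
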
5 Results



We now present the main result of this paper, namely the first determination of a polarized PDF set
based on the NNPDF methodology, NNPDFpol1.0. We will first illustrate the statistical features of our
PDF fit, then compare the NNPDFpol1.0 PDFs to other recent polarized parton sets pn[T2t [T ^ [T5] . We
will finally discuss the stability of our results upon the variation of several theoretical and methodological
assumptions: the treatment of target-mass corrections, the use of sum rules to fix the triplet and octet
axial charges, the implementation of positivity of PDFs, and preprocessing of neural networks and its
impact on small and large x behaviour.

We will not discuss here the way predictions for PDFs and uncertainties are obtained from NNPDF
replica sets, for which we refer to general reviews, such as Ref. [66].

5.1 Statistical features 

The statistical features of the NNPDFpol1.0 analysis are summarized in Tabs. 9 and 10 for the full dataset
and for individual experiments and sets respectively. The error function $\langle E \rangle$ Eq. (52) shown in the



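                                                              NNPDFpol1.0
$\chi^2_{\rm tot}$                                            0.77
$\langle E \rangle \pm \sigma_E$                              1.82 ± 0.18
$\langle E_{\rm tr} \rangle \pm \sigma_{E_{\rm tr}}$          1.66 ± 0.49
$\langle E_{\rm val} \rangle \pm \sigma_{E_{\rm val}}$        1.88 ± 0.67
$\langle {\rm TL} \rangle \pm \sigma_{\rm TL}$                6927 ± 3839
$\langle \chi^{2(k)} \rangle \pm \sigma_{\chi^2}$             0.91 ± 0.12

Table 9: Statistical estimators for NNPDFpol1.0 with $N_{\rm rep} = 100$ replicas.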

tables both for the total, training and validation datasets is the figure of merit for the quality of the
fit of each PDF replica to the corresponding data replica. The quantity which is actually minimized
during the neural network training is this figure of merit for the training set, supplemented by weighting
in the early stages of training according to Eq. (54) and by a Lagrange multiplier to enforce positivity
according to Eq. (62). In the table we also show the average over all replicas $\langle \chi^{2(k)}_{\rm tot} \rangle$ of $\chi^{2(k)}_{\rm tot}$ computed
for the $k$-th replica, which coincides with the figure of merit Eq. (54), but with the data replica $g_I^{({\rm art})(k)}$
replaced by the experimental data $g_I^{({\rm dat})}$. We finally show $\chi^2_{\rm tot}$, which coincides with the figure of merit
Eq. (54), but again with $g_I^{({\rm art})(k)}$ replaced by $g_I^{({\rm dat})}$, and also with $g_I^{({\rm net})(k)}$ replaced by $\langle g_I^{({\rm net})} \rangle$, i.e.
the average of the observable over replicas, which provides our best prediction. The average number of
iterations of the genetic algorithm at stopping, $\langle {\rm TL} \rangle$, is also given in this table.

The distributions of $\chi^{2(k)}$, $E_{\rm tr}^{(k)}$ and training lengths among the $N_{\rm rep} = 100$ replicas are shown in
Fig. 4 and Fig. 5 respectively. Note that the latter has a long tail which causes an accumulation of
points at the maximum training length, $N_{\rm gen}^{\rm max}$. This means that there is a fraction of replicas that do
not fulfill the stopping criterion. This may cause a loss in accuracy in outlier fits, which however make
up fewer than 10% of the total sample.

Experiment        $\chi^2_{\rm tot}$        $\langle E \rangle \pm \sigma_E$
HER97-A1N         0.79                      1.79 ± 0.30
HER-A1P           0.44                      1.49 ± 0.39
HER-A1D           1.13                      2.09 ± 0.50

Table 10: Same as Tab. 9 but for individual experiments.

The features of the fit can be summarized as follows:

• The quality of the central fit, as measured by its $\chi^2_{\rm tot} = 0.77$, is good. However, this value
should be taken with care in view of the fact that uncertainties for all experiments but two
are overestimated because the covariance matrix is not available and thus correlations between
systematics cannot be properly accounted for. This explains the value lower than one for this
quantity, which would be very unlikely if correlations had been included.

• The values of $\chi^2_{\rm tot}$ and $\langle E \rangle$ differ by approximately one unit. This is due to the fact that replicas
fluctuate within their uncertainty about the experimental data, which in turn are gaussianly
distributed about a true value [19]: it shows that the neural net is correctly reproducing the
underlying law thus being closer to the true value. This is confirmed by the fact that $\langle \chi^{2(k)} \rangle$ is
of order one.

• The distribution of $\chi^2$ for different experiments (also shown as a histogram in Fig. 6) shows
sizable differences, and indeed the standard deviation (shown as a dashed line in the plot) about
the mean (shown as a solid line) is very large. This can be understood as a consequence of the
lack of information on the covariance matrix: experiments where large correlated uncertainties
are treated as uncorrelated will necessarily have a smaller value of the $\chi^2$.
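The last point can be illustrated with a toy Monte Carlo. This is a sketch under simplifying assumptions, not part of the analysis: a constant "truth" is measured at several points with a statistical uncertainty and a fully correlated systematic shift, a constant is fitted, and the $\chi^2$ per point is computed either with the full covariance matrix or with the systematic added in quadrature as if it were uncorrelated. All names and default values are hypothetical.

```python
import numpy as np


def chi2_per_point(residuals, cov):
    """chi^2 divided by the number of points for a given covariance matrix."""
    return float(residuals @ np.linalg.solve(cov, residuals)) / residuals.size


def toy_chi2(n_points=20, sigma_stat=1.0, sigma_sys=1.0, n_toys=2000, seed=0):
    """Average chi^2 per point with the full and with the diagonal covariance."""
    rng = np.random.default_rng(seed)
    cov_full = (sigma_stat**2 * np.eye(n_points)
                + sigma_sys**2 * np.ones((n_points, n_points)))
    # Correlated systematic treated as uncorrelated: keep only the diagonal.
    cov_diag = np.diag(np.diag(cov_full))
    full, diag = [], []
    for _ in range(n_toys):
        common_shift = rng.normal(0.0, sigma_sys)  # same shift for all points
        data = common_shift + rng.normal(0.0, sigma_stat, n_points)
        # For this symmetric covariance the best-fit constant is the sample
        # mean under both treatments; the fit absorbs the common shift.
        residuals = data - data.mean()
        full.append(chi2_per_point(residuals, cov_full))
        diag.append(chi2_per_point(residuals, cov_diag))
    return float(np.mean(full)), float(np.mean(diag))
```

With equal statistical and systematic uncertainties, the full covariance matrix gives a value close to one, while the uncorrelated treatment gives roughly half of that, because the fit absorbs the common shift while the systematic uncertainty still inflates the denominator.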

5.2 Parton distributions 

The NNPDFpol1.0 parton distributions, computed from a set of $N_{\rm rep} = 100$ replicas, are displayed in
Fig. 7 at the input scale $Q_0^2 = 1$ GeV$^2$, in the PDF parametrization basis Eq. (48) as a function of $x$
both on a logarithmic and linear scale. In Figs. 8 and 9 the same PDFs are plotted in the flavour basis, and
compared to other available NLO PDF sets: BB10 [11] and AAC08 [12] in Fig. 8 and DSSV08 [15]
in Fig. 9. We do not show a direct comparison to the LSS polarized PDFs [14] because there are
no publicly available routines for the computation of PDF uncertainties for this set. Note that the
dataset used for the BB10 determination contains purely DIS data, and that for AAC contains DIS
supplemented by some high-$p_T$ RHIC pion production data: hence they are directly comparable to our
PDF determination. The DSSV08 determination instead includes, on top of DIS data, polarized jet
production data, and, more importantly, a large amount of semi-inclusive DIS data which in particular
allow for flavour-antiflavour separation and a more direct handle on strangeness. All uncertainties in
these plots correspond to the nominal 1-$\sigma$ error bands.

Figure 4: Distribution of $\chi^{2(k)}$ and $E_{\rm tr}^{(k)}$ over the sample of $N_{\rm rep} = 100$ replicas.

Figure 5: Distribution of training lengths over the sample of $N_{\rm rep} = 100$ replicas.

The main conclusions of this comparison are the following: 

• The central values of the $\Delta u + \Delta\bar{u}$ and the $\Delta d + \Delta\bar{d}$ are in reasonable agreement with those of
other parton sets. The NNPDFpol1.0 results are in best agreement with DSSV08, in slightly worse
agreement with AAC08, and in worst agreement with BB10. Uncertainties on these PDFs are
generally slightly larger for NNPDF than for other sets, especially DSSV, which however is based
on a much wider dataset.

• The NNPDFpol1.0 determination of $\Delta s + \Delta\bar{s}$ is affected by a much larger uncertainty than BB10
and AAC08, for almost all values of $x$. The AAC08 and BB10 strange PDFs fall well within the
NNPDFpol1.0 uncertainty band.

• The NNPDFpol1.0 determination of $\Delta s + \Delta\bar{s}$ is inconsistent at the two sigma level in the medium-small
$x \sim 0.1$ region with DSSV08, which is also rather more accurate, as one would expect as
it includes semi-inclusive data (in particular for production of hadrons with strangeness). This
suggests a tension between the inclusive analysis data and the semi-inclusive analysis.

• The gluon PDF is affected by a large uncertainty, rather larger than any other set, especially at
small $x$. In particular, the NNPDFpol1.0 polarized gluon distribution is compatible with zero for
all values of $x$.

• Uncertainties on the PDFs in the regions where no data are available tend to be larger than those
of other sets. At very large values of $x$ the PDF uncertainty band is largely determined by the
positivity constraint.

Figure 6: Value of the $\chi^2$ per data point for the datasets included in the NNPDFpol1.0 reference fit, listed in
Tab. 10. The horizontal line is the unweighted average of these $\chi^2$ over the datasets and the black dashed lines
give the one-sigma interval about it.

Finally, in Fig. 10 we compare the structure function $g_1(x,Q^2)$ for proton, deuteron and neutron,
computed using NNPDFpol1.0 (with its one-$\sigma$ uncertainty band) to the experimental data included in
the fit. Experimental data are grouped in bins of $x$ with a logarithmic spacing, while the NNPDF
prediction and its uncertainty are computed at the central value of each bin.
The uncertainty band in the NNPDFpol1.0 result is typically smaller than the experimental errors,
except at small $x$ where a much more restricted dataset is available; in that region, the uncertainties
are comparable. Scaling violations of the polarized structure functions are clearly visible
despite the limited range in $Q^2$.

Figure 7: The NNPDFpol1.0 polarized parton distributions at $Q_0^2 = 1$ GeV$^2$ in the parametrization basis plotted
as a function of $x$, on a logarithmic (left) and linear (right) scale.

Figure 8: Comparison of the NNPDFpol1.0 PDFs (in the flavour basis) and the BB10 [11] and AAC08 [12] PDFs.

Figure 9: Comparison of the NNPDFpol1.0 PDFs (in the flavour basis) and the DSSV08 PDFs [15].

Fit                                                        NNPDFpol1.0              NNPDFpol1.0     NNPDFpol1.0
                                                           $g_2 = g_2^{\rm WW}$     $m = 0$         $g_2 = 0$
$\chi^2_{\rm tot}$                                         0.77                     0.78            0.75
$\langle E \rangle \pm \sigma_E$                           1.82 ± 0.18              1.81 ± 0.16     1.83 ± 0.15
$\langle E_{\rm tr} \rangle \pm \sigma_{E_{\rm tr}}$       1.66 ± 0.49              1.62 ± 0.50     1.70 ± 0.38
$\langle E_{\rm val} \rangle \pm \sigma_{E_{\rm val}}$     1.88 ± 0.67              1.84 ± 0.70     1.96 ± 0.56
$\langle \chi^{2(k)} \rangle \pm \sigma_{\chi^2}$          0.91 ± 0.12              0.90 ± 0.09     0.86 ± 0.09

Table 11: The statistical estimators of Tab. 9 (obtained assuming $g_2 = g_2^{\rm WW}$) compared to a fit with $m = 0$ or
with $g_2 = 0$.

5.3 Stability of the results 

Our results have been obtained with a number of theoretical and methodological assumptions, discussed
in the preceding sections. We will now test the stability of our results upon variation of these assumptions.

5.3.1 Target-mass corrections and $g_2$

We have consistently included in our determination of $g_1$ corrections suppressed by powers of the
nucleon mass which are of kinematic origin. Thus in particular, as explained in Sect. 3.2, we have
included target-mass corrections (TMCs) up to first order in $m^2/Q^2$. Furthermore, both TMCs and
the relation between the measured asymmetries and the structure function $g_1$ involve contributions to
the structure function $g_2$ proportional to powers of $m^2/Q^2$ which we include according to Eq. (17) or
Eq. (18) (see the discussion in Sect. 2.2). Our default PDF set is obtained assuming that $g_2$ is given
by the Wandzura-Wilczek relation, Eq. (22).

In order to assess the impact of these assumptions on our results, we have performed two more
PDF determinations. In the first, we set $m = 0$ consistently everywhere, both in the extraction of the
structure functions from the asymmetry data and in our computation of structure functions. This
removes TMCs, and also contributions proportional to $g_2$. In the second, we retain mass effects, but
we assume $g_2 = 0$.

The statistical estimators for each of these three fits over the full dataset are shown in Tab. 11.
Clearly, all fits are of comparable quality.

Furthermore, in Fig. 11 we compare the PDFs at the initial scale $Q_0^2$ determined in these fits to our
default set: differences are hardly visible. This comparison can be made more quantitative by using the
distance $d(x,Q^2)$ between different fits, as defined in Appendix A of Ref. [23]. The distance is defined
in such a way that if we compare two different samples of $N_{\rm rep}$ replicas each extracted from the same
distribution, then on average $d = 1$, while if the two samples are extracted from two distributions which
differ by one standard deviation, then on average $d = \sqrt{N_{\rm rep}}$ (the difference being due to the fact that
the standard deviation of the mean scales as $1/\sqrt{N_{\rm rep}}$).

Figure 10: The proton, neutron and deuteron polarized structure function $g_1(x, Q^2)$ as functions of $Q^2$ in different
bins of $x$ compared to experimental data. Experimental data are grouped in bins of $x$, while NNPDFpol1.0 results
are given at the center of each bin, whose value is given next to each curve. In order to improve legibility, the
values of $g_1(x, Q^2)$ have been shifted by the amount given next to each curve.

Figure 11: Comparison between the default NNPDFpol1.0 PDFs (labeled as $g_2 = g_2^{\rm WW}$ in the plot), PDFs with
$m = 0$ (labeled as noTMCs in the plot) and PDFs with $g_2 = 0$; each corresponds to the statistical estimators of
Tab. 11.

The distances $d(x, Q^2)$ between central values and uncertainties of the three fits of Tab. 11 are shown
in Fig. 12. They never exceed $d = 4$, which means less than half a standard deviation for $N_{\rm rep} = 100$.
It is interesting to observe that distances tend to be larger in the large-$x$ region, where the expansion
in powers of $m^2/Q^2$ is less accurate, and the effects of dynamical higher twists can become relevant. It
is reassuring that even in this region the distances are reasonably small.
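The conversion between a distance and a shift in units of the standard deviation follows from the normalization quoted above, $d = \sqrt{N_{\rm rep}}$ for a one-sigma difference. A minimal sketch of this arithmetic (the function name is hypothetical, and only the normalization stated in the text is assumed):

```python
import math


def distance_in_sigma(d, n_rep):
    """Convert a distance d into units of the standard deviation,
    assuming d = sqrt(n_rep) corresponds to a one-sigma difference."""
    return d / math.sqrt(n_rep)


# For N_rep = 100 a distance d = 4 corresponds to 0.4 sigma.
shift = distance_in_sigma(4, 100)
```

With $N_{\rm rep} = 100$ the conversion factor is simply 1/10, so distances of 4, 5 and 6 correspond to 0.4, 0.5 and 0.6 standard deviations respectively.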

We conclude that inclusive DIS data, with our kinematic cuts, do not show sensitivity to finite
nucleon mass effects, either in terms of fit quality or in terms of the effect on PDFs.

5.3.2 Sum rules 

Our default PDF fit is obtained by assuming that the triplet axial charge $a_3$ is fixed to its value extracted
from $\beta$ decay, Eq. (37), and that the octet axial charge $a_8$ is fixed to the value of $a_8$ determined from
baryon octet decays, but with an inflated uncertainty in order to allow for SU(3) violation, Eq. (39). As
discussed after Eq. (51), uncertainties on them are included by randomizing their values among replicas.
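Randomizing the axial charges among replicas can be sketched as drawing one value of $a_3$ and one of $a_8$ per replica from a Gaussian centred on the assumed value with the assumed uncertainty. The snippet below is an illustration under that assumption only: the function name is hypothetical, and the numbers in the usage line are placeholders, not the values of Eqs. (37) and (39).

```python
import random


def sample_axial_charges(n_rep, a3, sigma_a3, a8, sigma_a8, seed=0):
    """Draw one (a3, a8) pair per replica from independent Gaussians."""
    rng = random.Random(seed)
    return [(rng.gauss(a3, sigma_a3), rng.gauss(a8, sigma_a8))
            for _ in range(n_rep)]


# Placeholder central values and uncertainties, for illustration only.
charges = sample_axial_charges(100, a3=1.0, sigma_a3=0.01, a8=0.5, sigma_a8=0.1)
```

Each replica would then be fitted with its own pair of values imposed through the sum rules, so that the spread of the replica sample reflects the uncertainty on the axial charges.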

In order to test the impact of these assumptions, we have produced two more PDF determinations.
In the first, we have not imposed the triplet sum rule Eq. (35), so in particular $a_3$ is free and determined
by the data, instead of being fixed to the value Eq. (37). In the second, we have assumed that the
uncertainty on $a_8$ is given by the much smaller value of Eq. (38).

The statistical estimators for the total dataset for each of these fits are shown in Tab. 12. Here too,
there is no significant difference in fit quality between these fits and the default.

The distances between PDFs in the default and the free $a_3$ fits are displayed in Fig. 13. As one may
expect, only the triplet is affected significantly: the central value is shifted by about $d \sim 5$, i.e. about
half-$\sigma$, in the region $x \sim 0.3$, where $x\Delta T_3$ has a maximum, and also around $x \sim 0.01$. The uncertainties
on the PDFs are very similar in both cases for all PDFs, except $\Delta T_3$ at small $x$: in this case, removing
the $a_3$ sum rule results in a moderate increase of the uncertainties; the effect of removing $a_3$ is otherwise
negligible. The singlet and triplet PDFs for these two fits are compared in Fig. 14.

Figure 12: Distances between each pair of the three sets of PDFs shown in Fig. 11.

Fit                                                        free $a_3$        $a_8$ Eq. (38)
$\chi^2_{\rm tot}$                                         0.79              0.77
$\langle E \rangle \pm \sigma_E$                           1.84 ± 0.19       1.86 ± 0.19
$\langle E_{\rm tr} \rangle \pm \sigma_{E_{\rm tr}}$       1.73 ± 0.41       1.66 ± 0.53
$\langle E_{\rm val} \rangle \pm \sigma_{E_{\rm val}}$     1.93 ± 0.58       1.87 ± 0.71
$\langle \chi^{2(k)} \rangle \pm \sigma_{\chi^2}$          0.93 ± 0.12       0.92 ± 0.15

Table 12: The statistical estimators of Tab. 9, but for fits in which the triplet sum rule is not imposed (free $a_3$)
or in which the octet sum rule is imposed with the smaller uncertainty Eq. (38).

Figure 13: Distances between PDFs (central values and uncertainties) for the default fit, with $a_3$ fixed, and the
fit with free $a_3$, computed using $N_{\rm rep} = 100$ replicas from each set.

The distances between the default and the fit with the smaller uncertainty on $a_8$, Eq. (38), are
shown in Fig. 15. In this case, again as expected, the only effect is on the $\Delta T_8$ uncertainty, which
changes at small $x$ by up to $d \sim 6$ (about half a standard deviation): if a more
accurate value of $a_8$ is assumed, the determined $\Delta T_8$ is correspondingly more accurate. Central values
are unaffected. The singlet and octet PDFs for this fit are compared to the default in Fig. 16. We
conclude that the size of the uncertainty on $\Delta T_8$ has a moderate effect on our fit; on the other hand
it is clear that if the octet sum rule were not imposed at all, the uncertainty on the octet and thus on
strangeness would increase very significantly, as we have checked explicitly.

We conclude that our fit results are quite stable upon variations of our treatment of both the triplet
and the octet sum rules.

5.4 Positivity 

Figure 15: Distances between PDFs (central values and uncertainties) for the default fit, with $a_8$ Eq. (39), and
the fit with the value of $a_8$ with smaller uncertainty, Eq. (38).

As discussed in Sect. 4, positivity of the individual cross sections entering the polarized asymmetries
Eq. (1) has been imposed at leading order according to Eq. (61), using the NNPDF2.1 NNLO PDF set [25],
separately for the lightest polarized quark PDF combinations $\Delta u + \Delta\bar{u}$, $\Delta d + \Delta\bar{d}$, $\Delta s + \Delta\bar{s}$ and for the
polarized gluon PDF, by means of a Lagrange multiplier, Eq. (62). After stopping, positivity is checked
a posteriori and replicas which do not satisfy it are discarded and retrained.
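The a posteriori check can be sketched as a scan over a grid in $x$. The snippet below is an illustration, not the NNPDF implementation: it assumes the leading-order bound takes the simple form $|\Delta f(x)| \le f(x)$, with $f$ the corresponding unpolarized PDF, and ignores any allowance for the uncertainty on $f$; all function names and the optional tolerance are hypothetical.

```python
def violates_positivity(x_grid, polarized, unpolarized, tolerance=0.0):
    """True if |Delta f(x)| > f(x) + tolerance at any point of the grid."""
    return any(abs(polarized(x)) > unpolarized(x) + tolerance for x in x_grid)


def split_replicas(x_grid, replicas, unpolarized, tolerance=0.0):
    """Separate replicas that satisfy the bound from those to be retrained."""
    kept, rejected = [], []
    for replica in replicas:
        if violates_positivity(x_grid, replica, unpolarized, tolerance):
            rejected.append(replica)
        else:
            kept.append(replica)
    return kept, rejected
```

In a workflow of this kind, the rejected replicas would be refitted and checked again until the required number of replicas satisfying the bound is reached.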

In Fig. 17 we compare to the positivity bound for the up, down, strange PDF combinations and
gluon PDF a set of $N_{\rm rep} = 100$ replicas obtained by enforcing positivity through a Lagrange multiplier,
but before the final, a posteriori check. Almost all replicas satisfy the constraint, but at least one
replica that clearly violates it for the $\Delta s + \Delta\bar{s}$ combination (and will thus be discarded) is visible.

In order to assess the effect of the positivity constraints we have performed a fit without imposing
positivity. Because positivity significantly affects PDFs in the region where no data are available, and
thus in particular their large $x$ behaviour, preprocessing exponents for this PDF determination had to
be determined again using the procedure described in Sect. 4.1. The values of the large $x$ preprocessing
exponents used in the fit without positivity are shown in Tab. 13. The small $x$ exponents are the same
as in the baseline fit, Tab. 5.

The corresponding estimators are shown in Tab. 14. Also in this case, we see no significant change in
fit quality, with only a slight improvement in $\chi^2_{\rm tot}$ when the constraint is removed. This shows that our
PDF parametrization is flexible enough to easily accommodate positivity. On the other hand, clearly
the positivity bound has a significant impact on PDFs, especially in the large $x$ region, as shown in
Fig. 18, where PDFs obtained from this fit are compared to the baseline. At small $x$, instead, the
impact of positivity is moderate, because, as discussed in Sect. 4.4, $g_1/F_1 \sim x$ as $x \to 0$ [64], so there
is no constraint in the limit. This in particular implies that there is no significant loss of accuracy
in imposing the LO positivity bound, because in the small-$x$ region, where the LO and NLO
positivity bounds differ significantly [65], the bound is not significant.

Figure 16: Comparison of the singlet and octet PDFs for the default fit, with $a_8$ Eq. (39), and the fit with the
value of $a_8$ with smaller uncertainty, Eq. (38).

PDF                            $m$
$\Delta\Sigma(x,Q_0^2)$        [0.5, 5.0]
$\Delta g(x,Q_0^2)$            [0.5, 5.0]
$\Delta T_3(x,Q_0^2)$          [0.5, 4.0]
$\Delta T_8(x,Q_0^2)$          [0.5, 6.0]

Table 13: Ranges for the large $x$ preprocessing exponents Eq. (48) for the fit in which no positivity is imposed.
The small $x$ exponents are the same as in the baseline fit, Tab. 5.

5.5 Small- and large-x behaviour and preprocessing 

The asymptotic behavior of both polarized and unpolarized PDFs for $x$ close to 0 or 1 is not controlled
by perturbation theory, because powers of $\ln\frac{1}{x}$ and $\ln(1-x)$ respectively appear in the perturbative
coefficients, thereby spoiling the reliability of the perturbative expansion close to the endpoints. Non- 
perturbative effects are also expected to set in eventually (see e.g. [MIEZ]). For this reason, our fitting 
procedure makes no assumptions on the large- and small-x behaviors of PDFs, apart from the positivity 
and integrability constraints discussed in the previous Section. 

Figure 17: The positivity bound Eq. (61), compared to a set of $N_{\rm rep} = 100$ replicas (dashed lines).

It is however necessary to check that no bias is introduced by the preprocessing. We do this
following the iterative method described in Sect. 4.1. The outcome of the procedure is the set of
exponents Eq. (48), listed in Tab. 5. The lack of bias with these choices is explicitly demonstrated in
Fig. 19, where we plot the 68% confidence level of the distribution of

$$\alpha[\Delta q(x,Q^2)] = \frac{\ln \Delta q(x,Q^2)}{\ln \frac{1}{x}} , \qquad (67)$$

$$\beta[\Delta q(x,Q^2)] = \frac{\ln \Delta q(x,Q^2)}{\ln (1-x)} , \qquad (68)$$

$\Delta q = \Delta\Sigma, \Delta g, \Delta T_3, \Delta T_8$, for the default NNPDFpol1.0 $N_{\rm rep} = 100$ replica set, at $Q^2 = Q_0^2 = 1$ GeV$^2$,
and compare them to the ranges of Tab. 5. It is apparent that as the endpoints $x = 0$ and $x = 1$ are
approached, the uncertainties on both the small-$x$ and the large-$x$ exponents lie well within the range
of the preprocessing exponents for all PDFs, thus confirming that the latter do not introduce any bias.
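The effective exponents of Eqs. (67) and (68) are straightforward to evaluate for any PDF replica. The following is a minimal sketch with hypothetical function names; taking the absolute value inside the logarithm is an assumption made here so that negative PDF values do not break the toy example.

```python
import math


def effective_alpha(delta_q, x):
    """Effective small-x exponent, Eq. (67): ln|Delta q| / ln(1/x)."""
    return math.log(abs(delta_q(x))) / math.log(1.0 / x)


def effective_beta(delta_q, x):
    """Effective large-x exponent, Eq. (68): ln|Delta q| / ln(1 - x)."""
    return math.log(abs(delta_q(x))) / math.log(1.0 - x)


def toy_pdf(x):
    # Toy shape x^(-0.3) (1 - x)^3, chosen only to test the limits.
    return x ** -0.3 * (1.0 - x) ** 3


alpha_small_x = effective_alpha(toy_pdf, 1e-6)      # tends to 0.3 as x -> 0
beta_large_x = effective_beta(toy_pdf, 1.0 - 1e-6)  # tends to 3 as x -> 1
```

Evaluating these functions for each replica on a grid in $x$ and taking the central 68% interval at each point gives bands of the kind compared to the preprocessing ranges.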
Figure 18: The NNPDFpol1.0 PDFs with and without positivity constraints compared at the initial parametrization
scale $Q_0^2 = 1$ GeV$^2$ in the flavor basis.

Figure 19: The 68% confidence level of the distribution of effective small- and large-$x$ exponents Eqs. (67)-(68)
for the default $N_{\rm rep} = 100$ replica NNPDFpol1.0 set at $Q_0^2 = 1$ GeV$^2$, plotted as functions of $x$. The range of
variation of the preprocessing exponents of Tab. 5 is also shown in each case (solid lines).

Fit                                                        NNPDFpol1.0 no positivity
$\chi^2_{\rm tot}$                                         0.72
$\langle E \rangle \pm \sigma_E$                           1.84 ± 0.22
$\langle E_{\rm tr} \rangle \pm \sigma_{E_{\rm tr}}$       1.60 ± 0.20
$\langle E_{\rm val} \rangle \pm \sigma_{E_{\rm val}}$     2.07 ± 0.39
$\langle \chi^{2(k)} \rangle \pm \sigma_{\chi^2}$          0.95 ± 0.16

Table 14: The statistical estimators of Tab. 9 for a fit without positivity constraints.

6 Polarized nucleon structure



The NNPDFpol1.0 PDF set may be used for a determination of the first moments of polarized parton
distributions. As briefly summarized in the introduction, these are the quantities of greatest physical
interest in that they are directly related to the spin structure of the nucleon, and indeed their determination,
in particular the determination of the first moments of the quark and gluon distributions,
has been the main motivation for the experimental campaign of $g_1$ measurements. The determination
of the isotriplet first moment, because of the Bjorken sum rule, provides a potentially accurate and
unbiased handle on the strong coupling $\alpha_s$.

6.1 First moments 

We have computed the first moments

$$\langle \Delta f(Q^2) \rangle \equiv \int_0^1 dx\, \Delta f(x,Q^2) \qquad (69)$$

of each light polarized quark-antiquark and gluon distribution using a sample of $N_{\rm rep} = 100$ NNPDFpol1.0
PDF replicas. The histograms of the distributions of first moments over the replica sample at $Q_0^2 =
1$ GeV$^2$ are displayed in Fig. 20: they appear to be reasonably approximated by a Gaussian.
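The computation of Eq. (69) over a replica sample can be sketched as follows. This is an illustration with hypothetical names, not the code used for the results quoted here: each replica is assumed to be available as a callable $\Delta f(x)$, and the integral is evaluated with a midpoint rule in $\ln x$, which also gives a truncated moment if `x_min` is raised.

```python
import math
import statistics


def first_moment(delta_f, x_min=1e-9, n_steps=20000):
    """Integral of delta_f(x) over [x_min, 1], midpoint rule in t = ln x."""
    t_min = math.log(x_min)
    step = -t_min / n_steps
    total = 0.0
    for i in range(n_steps):
        x = math.exp(t_min + (i + 0.5) * step)
        total += x * delta_f(x)  # dx = x dt
    return total * step


def moment_statistics(replicas, x_min=1e-9):
    """Central value (mean) and uncertainty (standard deviation) over replicas."""
    moments = [first_moment(replica, x_min) for replica in replicas]
    return statistics.mean(moments), statistics.stdev(moments)
```

The central value is the mean over replicas and the uncertainty is their standard deviation; a histogram of the individual replica moments is what one would inspect to judge how Gaussian the distribution is.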

The central value and one-$\sigma$ uncertainties of the quark first moments are listed in Tab. 15, while
those of the singlet quark combination Eq. (29) and the gluon are given in Tab. 16. Results are
compared to those from other parton sets, namely ABFR98 [8], DSSV08 [68], AAC08 [12], BB10 [11]
and LSS10 [14]. Results from other PDF sets are not available for all combinations and scales, because
public codes only allow for the computation of first moments in a limited $x$ range, in particular down
to a minimum value of $x$: hence we must rely on published values for the first moments. In particular,
the DSSV and AAC results are shown at $Q^2 = 1$ GeV$^2$, while the BB and LSS results are shown at
$Q^2 = 4$ GeV$^2$. For ease of reference, the NNPDF values for both scales are shown in Tab. 16.









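PDF set          First moment                                      cv        exp      th       tot
NNPDFpol1.0      $\langle \Delta u + \Delta\bar{u} \rangle$        0.80      0.08     -        0.08
NNPDFpol1.0      $\langle \Delta d + \Delta\bar{d} \rangle$        -0.46     0.08     -        0.08
NNPDFpol1.0      $\langle \Delta s + \Delta\bar{s} \rangle$        -0.13     0.09     -        0.09
DSSV08 [68]      $\langle \Delta u + \Delta\bar{u} \rangle$        0.817     0.013    0.008    0.015
DSSV08 [68]      $\langle \Delta d + \Delta\bar{d} \rangle$        -0.453    0.011    0.036    0.038
DSSV08 [68]      $\langle \Delta s + \Delta\bar{s} \rangle$        -0.110    0.023    0.098    0.101

Table 15: First moments of the polarized quark distributions at $Q_0^2 = 1$ GeV$^2$; cv denotes the central value,
while exp and th denote uncertainties (see text) whose sum in quadrature is given by tot.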

In order to compare the results for first moments shown in Tabs. 15 and 16, it should be understood
that the uncertainties shown, and sometimes also the central values, have somewhat different meanings.
In particular:

• For NNPDFpol1.0 the exp uncertainty, determined as the standard deviation of the replica sample,
is a pure PDF uncertainty: it includes the propagation of the experimental data uncertainties
and the uncertainty due to the interpolation and extrapolation.

• In the ABFR98 study, the central values were obtained in the so-called AB factorization scheme [6].
While the gluon in this scheme coincides with the gluon in the $\overline{\rm MS}$ scheme used here (and thus
the value from Ref. [8] for the gluon is shown in Tab. 16), the quark singlet differs from it.
However, in Ref. [8] a value of the singlet axial charge $a_0$ in the limit of infinite $Q^2$ was also
given. In the $\overline{\rm MS}$ scheme, the singlet axial charge and the first moment of $\Delta\Sigma$ coincide [6], hence we have
determined $\langle \Delta\Sigma \rangle$ for ABFR98 by evolving down to $Q^2 = 1$ GeV$^2$ the value of $a_0(\infty)$ given in
Ref. [8], at NLO and with $\alpha_s(M_Z^2) = 0.118$ [69] (the impact of the $\alpha_s$ uncertainty is negligible).
We have checked that the same result is obtained if $a_0$ is computed as the appropriate linear
combination of $\langle \Delta\Sigma \rangle$ in the AB scheme and the first moment of $\Delta g$. In the ABFR98 study, the
exp uncertainty is the Hessian uncertainty on the best fit, and it thus includes the propagated
data uncertainty. The th uncertainty includes the uncertainty originated by neglected higher
orders (estimated by renormalization and factorization scale variations), higher twists, position
of heavy quark thresholds, value of the strong coupling, violation of SU(3) (uncertainty on $a_8$,
Eq. (36)), and finally uncertainties related to the choice of functional form, estimated by varying
the functional form. This latter source of theoretical uncertainty corresponds to interpolation and
extrapolation uncertainties which are included in the exp for NNPDF.

• For DSSV08 and BB10 PDFs, the central value is obtained by computing the first moment integral
of the best-fit with a fixed functional form restricted to the data region, and then supplementing
it with a contribution due to the extrapolation in the unmeasured (small $x$) region. The exp
uncertainty in the table is the Hessian uncertainty given by DSSV08 or BB10 on the moment in
the measured region, and it thus includes the propagated data uncertainty. In both cases, we
have determined the th uncertainty shown in the table as the difference between the full first
moment quoted by DSSV08 or BB10, and the first moment in the measured region. It is thus the
contribution from the extrapolation region, which we assume to be 100% uncertain. In both cases,
we have computed the truncated first moment in the measured region using publicly available
codes, and checked that it coincides with the values quoted by DSSV08 and BB10.

• For AAC08, the central value is obtained by computing the first moment integral of the best-fit
with a fixed functional form, and the exp uncertainty is the Hessian uncertainty on it. However,
AAC08 uses a so-called tolerance [70] criterion for the determination of Hessian uncertainties,
which rescales the $\Delta\chi^2 = 1$ region by a suitable factor, in order to effectively take into account
also interpolation errors. Hence, the exp uncertainties include propagated data uncertainties, as
well as uncertainties on the PDF shape.

• For LSS10, the central value is obtained by computing the first moment integral of the best-fit
with a fixed functional form, and the exp uncertainty is the Hessian uncertainty on it. Hence it
includes the propagated data uncertainty.

Figure 20: Distribution of the first moments of $\Delta u + \Delta\bar{u}$ (top left), $\Delta d + \Delta\bar{d}$ (top right), $\Delta s + \Delta\bar{s}$ (bottom left)
over a set of $N_{\rm rep} = 100$ NNPDFpol1.0 PDF replicas.

                                  $\langle \Delta\Sigma \rangle$                               $\langle \Delta g \rangle$
PDF set                           cv       exp      th             tot            cv       exp      th       tot
NNPDFpol1.0 (1 GeV$^2$)           0.22     0.20     -              0.20           -1.2     ...      -        ...
NNPDFpol1.0 (4 GeV$^2$)           0.18     0.20     -              0.20           ...      ...      -        ...
ABFR98 [8]                        0.12     0.05     +0.19/-0.12    +0.19/-0.13    1.6      0.4      0.8      ...
DSSV08 [68]                       0.255    0.019    0.126          0.127          -0.12    0.12     0.06     0.13
AAC08 [12] (positive)             0.26     0.06     -              0.06           0.40     0.28     -        0.28
AAC08 [12] (node)                 0.25     0.07     -              0.07           -0.12    1.78     -        1.78
BB10 [11]                         0.19     0.08     0.23           0.24           0.46     0.43     0.004    0.43
LSS10 [14] (positive)             0.207    0.034    -              0.034          0.316    0.190    -        0.190
LSS10 [14] (node)                 0.254    0.042    -              0.042          -0.34    0.46     -        0.46

Table 16: Same as Tab. 15, but for the total singlet quark distribution and the gluon distribution. The NNPDF
results are shown both at $Q_0^2 = 1$ GeV$^2$ and $Q^2 = 4$ GeV$^2$, the ABFR, DSSV and AAC results are shown at
$Q_0^2 = 1$ GeV$^2$, and the BB10 and LSS10 results are shown at $Q^2 = 4$ GeV$^2$.

In all cases, the total uncertainty is computed as the sum in quadrature of the exp and th uncertainties.
Roughly speaking, for LSS10 this includes only the data uncertainties; for DSSV08 and BB10 it
also includes extrapolation uncertainties; for AAC08 interpolation uncertainties; for NNPDFpol1.0 both
extrapolation and interpolation uncertainties; and for ABFR98 all of the above, but also theoretical
(QCD) uncertainties. For LSS10 and AAC08, we quote the results obtained from two different fits,
assuming either a positive gluon PDF or a gluon PDF with a node: their spread gives a feeling for the missing uncertainty due
to the choice of functional form. Note that the AAC08 results correspond to their Set B which includes,
besides DIS data, also RHIC $\pi^0$ production data; the DSSV08 fit also includes, on top of these, RHIC
jet data and semi-inclusive DIS data; LSS10 includes, beside DIS, also semi-inclusive DIS data. All
other sets are based on DIS data only.
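The combination in quadrature can be checked directly against the DSSV08 entries of Tabs. 15 and 16. A minimal sketch (the function name is hypothetical; the numbers are the exp and th values quoted in the tables):

```python
import math


def total_uncertainty(exp, th=0.0):
    """Sum in quadrature of the exp and th uncertainties."""
    return math.hypot(exp, th)


# DSSV08 (exp, th) pairs from Tab. 15 (u, d, s) and Tab. 16 (singlet).
dssv08 = {
    "u": (0.013, 0.008),
    "d": (0.011, 0.036),
    "s": (0.023, 0.098),
    "singlet": (0.019, 0.126),
}
totals = {name: round(total_uncertainty(exp, th), 3)
          for name, (exp, th) in dssv08.items()}
```

Rounded to three decimals, these reproduce the tot column quoted for DSSV08 (0.015, 0.038, 0.101 and 0.127).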

Coming now to a comparison of results, we see that for the singlet first moment $\langle \Delta\Sigma \rangle$ the NNPDFpol1.0
result is consistent within uncertainties with that of other groups. The uncertainty on the NNPDFpol1.0
result is comparable (if somewhat larger) to that found whenever the extrapolation uncertainty has been
included. For individual quark flavors (Tab. 15) we find excellent agreement in the central values obtained
between NNPDFpol1.0 and DSSV08; the NNPDF uncertainties are rather larger, but this could
also be due to the fact that the DSSV08 dataset is sensitive to flavour separation.

For the gluon first moment $\langle \Delta g \rangle$, the NNPDFpol1.0 result is characterized by an uncertainty which
is much larger than that of any other determination: a factor of three or four larger than ABFR98 and
AAC08, ten times larger than BB10, and twenty times larger than DSSV08 and LSS10. It is compatible
with zero within this large uncertainty. We have seen that for the quark singlet, the NNPDFpol1.0
uncertainty is similar to that of groups which include an estimate of extrapolation uncertainties. In
order to assess the impact of the extrapolation uncertainty for the gluon, we have computed the gluon
first truncated moment in the region $x \in [10^{-3}, 1]$:

$$\int_{10^{-3}}^{1} dx\, \Delta g(x, Q^2 = 1\ {\rm GeV}^2) = -0.26 \pm 1.19 , \qquad (70)$$

to be compared with the result of Tab. 16, whose uncertainty is larger by almost a factor four.

We must conclude that the experimental status of the gluon first moment is still completely uncertain,
unless one is willing to make strong theoretical assumptions on the behaviour of the polarized
gluon at small $x$, and that previous different conclusions were affected by a significant underestimate
of the impact of the bias in the choice of functional form, in the data and especially in the extrapolation
region. Because of the large uncertainty related to the extrapolation region, only low $x$ data can improve
this situation, such as those which could be collected at a high energy electron-ion collider [32,71].

6.2 The Bjorken sum rule 

Perturbative factorization, expressed in this context by Eq. (28) for the structure function $g_1(x,Q^2)$,
and the assumption of exact isospin symmetry, immediately lead to the so-called Bjorken sum rule
(originally derived [72,73] using current algebra):

$$\Gamma_1^p(Q^2) - \Gamma_1^n(Q^2) = \frac{1}{6}\, \Delta C_{\rm NS}(\alpha_s(Q^2))\, a_3 , \qquad (71)$$

where

$$\Gamma_1^{p,n}(Q^2) \equiv \int_0^1 dx\, g_1^{p,n}(x,Q^2) , \qquad (72)$$

and $\Delta C_{\rm NS}(\alpha_s(Q^2))$ is the first moment of the non-singlet coefficient function, while $a_3$ is the
triplet axial charge defined earlier.

Because the first moment of the non-singlet coefficient function $\Delta C_{\rm NS}$ is known up to three loops [71]
and isospin symmetry is expected to hold to high accuracy, the Bjorken sum rule Eq. (71) potentially
provides a theoretically very accurate handle on the strong coupling constant: in principle, the truncated
isotriplet first moment

$$\Gamma_1^{\rm NS}(Q^2, x_{\min}) \equiv \int_{x_{\min}}^{1} dx\, \left[ g_1^p(x,Q^2) - g_1^n(x,Q^2) \right] \qquad (73)$$

can be extracted from the data without any theoretical assumption. Given a measurement of $\Gamma_1^{\rm NS}(Q^2, 0)$
at one scale, the strong coupling can then be extracted from Eq. (71) using the value of $a_3$ from $\beta$ decays,
while given a measurement of $\Gamma_1^{\rm NS}(Q^2, 0)$ at two scales both $\alpha_s$ and the value of $a_3$ can be extracted
simultaneously.
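The extraction at a single scale can be sketched as follows. This is a simplified illustration, not the procedure of any of the analyses discussed here: it assumes that the coefficient function is truncated at next-to-leading order, $\Delta C_{\rm NS} = 1 - \alpha_s/\pi$, whereas higher-order terms are known, and the function names are hypothetical.

```python
import math


def delta_c_ns_nlo(alpha_s):
    """First moment of the non-singlet coefficient function, NLO truncation."""
    return 1.0 - alpha_s / math.pi


def bjorken_sum(a3, alpha_s):
    """Right-hand side of Eq. (71) with the NLO coefficient function."""
    return a3 * delta_c_ns_nlo(alpha_s) / 6.0


def alpha_s_from_bjorken(gamma_ns, a3):
    """Invert Eq. (71) at NLO for alpha_s, given the full first moment and a3."""
    return math.pi * (1.0 - 6.0 * gamma_ns / a3)
```

At this order the relation is linear in $\alpha_s$, so the inversion is exact; with measurements at two scales one would instead solve the two resulting equations for $\alpha_s$ and $a_3$ together, using the running of the coupling to relate the scales.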
Figure 21: The truncated Bjorken sum rule $\Gamma_1^{\rm NS}(Q^2, x)$ Eq. (73) plotted as a function of $x$ for $Q^2 = 1$ GeV$^2$,
for the fit with free $a_3$ (left) and for the reference fit with $a_3$ fixed to the value Eq. (37) (right panel). In the
left plot, the shaded band corresponds to the asymptotic value of the truncated sum rule, Eq. (75), while in the
right plot it corresponds to the experimental value Eq. p7|) . 



In Ref. [75], $a_3$ and $\alpha_s$ were simultaneously determined from a set of nonsinglet truncated moments 
(both the first and higher moments), by exploiting the scale dependence of the latter [76], with the 
result $g_A = 1.04 \pm 0.13$ and $\alpha_s(M_Z) = 0.126^{+\ldots}_{-\ldots}$, where the uncertainty is dominated by the data, 
interpolation and extrapolation, but also includes theoretical (QCD) uncertainties. In this reference, 
truncated moments were determined from a neural network interpolation of existing data, sufficient 
for a computation of moments at any scale. However, because the small x behaviour of the structure 
function is only weakly constrained by data, the $x \to 0$ extrapolation was done by assuming a powerlike 
(Regge) behaviour [77]. 
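
To illustrate how strongly a powerlike assumption controls the unmeasured region, the following minimal sketch integrates a tail of the form g(x) = g(x0) (x/x0)^(-alpha) from 0 to x0 in closed form. The function name `regge_tail`, the matching point `x0`, the normalization `g_at_x0` and the exponents `alpha` are all hypothetical choices made for illustration; they are not taken from Ref. [75] or Ref. [77].

```python
def regge_tail(g_at_x0: float, x0: float, alpha: float) -> float:
    """Integral over 0 < x < x0 of g(x) = g(x0) * (x / x0) ** (-alpha).

    The integral converges only for alpha < 1, where it equals
    g(x0) * x0 / (1 - alpha).
    """
    if not 0.0 < x0 <= 1.0:
        raise ValueError("x0 must lie in (0, 1]")
    if alpha >= 1.0:
        raise ValueError("the integral diverges for alpha >= 1")
    return g_at_x0 * x0 / (1.0 - alpha)


if __name__ == "__main__":
    # Hypothetical inputs: smallest measured x and the value of the
    # non-singlet structure function there (arbitrary normalization).
    x0, g_at_x0 = 0.005, 1.0
    for alpha in (-0.5, 0.0, 0.5):
        tail = regge_tail(g_at_x0, x0, alpha)
        print(f"alpha = {alpha:+.1f}: contribution from x < x0 = {tail:.5f}")
```

Because the tail scales as 1/(1 - alpha), moving the assumed exponent from 0 to 0.5 doubles the contribution of the region below x0 even though the data above x0 are unchanged.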

The situation within NNPDFpol1.0 can be understood by exploiting the PDF determination in which 
$a_3$ is not fixed by the triplet sum rule, discussed in Sect. 5.3.2. Using the results of this determination, 
we find 

$$a_3 = \int_0^1 dx\, \Delta T_3(x,Q^2) = 1.19 \pm 0.22 . \tag{74}$$

The uncertainty is about twice that of the determination of Ref. [75]. As mentioned, the latter was 
obtained from a neural network parametrization of the data with no theoretical assumptions, and 
based on a methodology which is quite close to that of the NNPDFpol1.0 PDF determination discussed 
here, the only difference being the assumption of Regge behaviour in order to perform the small x 
extrapolation. This strongly suggests that, as in the case of the gluon distribution discussed above, the 
uncertainty on the value Eq. (74) is dominated by the small x extrapolation. 

To study this, in Fig. 21 we plot the value of the truncated Bjorken sum rule $\Gamma_1^{\rm NS}(Q^2, x_{\rm min})$, Eq. (73), 
as a function of the lower limit of integration $x_{\rm min}$ at $Q_0^2 = 1$ GeV$^2$, along with the asymptotic value 

$$\Gamma_1^{\rm NS}(1~{\rm GeV}^2, 0) = 0.16 \pm 0.03 , \tag{75}$$

which at NLO corresponds to the value of $a_3$ given by Eq. (74). As a consistency check, we also show 
the same plot for our baseline fit, in which $a_3$ is fixed by the sum rule to the value Eq. (37). It is clear 
that indeed the uncertainty is completely dominated by the small x extrapolation. 

This suggests that a determination of $\alpha_s$ from the Bjorken sum rule is not competitive unless 
one is willing to make assumptions on the small x behaviour of the nonsinglet structure function in 
the unmeasured region. Indeed, it is clear that a determination based on NNPDFpol1.0 would be 
affected by an uncertainty which is necessarily larger than that found in Ref. [75], which is already not 
competitive. The fact that a determination of $\alpha_s$ from the Bjorken sum rule is not competitive due to 
small x extrapolation ambiguities was already pointed out in Ref. [8], where values of $a_3$ and $\alpha_s$ similar 
to those of Ref. [75] were obtained. 
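
The size of the effect can be gauged with a short numerical sketch, which inverts Eq. (71) for the strong coupling and propagates the uncertainty of Eq. (75) linearly. Several simplifications are assumed here and are not the procedure of Ref. [75]: the coefficient function is truncated at first order, $\Delta C_{\rm NS} = 1 - \alpha_s/\pi$; the value $a_3 = 1.27$ is a rounded $\beta$-decay value inserted by hand; the uncertainty on $a_3$ is neglected; and all function names are hypothetical.

```python
import math


def delta_c_ns(alpha_s: float) -> float:
    """First moment of the non-singlet coefficient function at O(alpha_s)."""
    return 1.0 - alpha_s / math.pi


def bjorken_sum(alpha_s: float, a3: float) -> float:
    """Right-hand side of Eq. (71): (1/6) * Delta C_NS * a3."""
    return delta_c_ns(alpha_s) * a3 / 6.0


def alpha_s_from_sum(gamma: float, a3: float) -> float:
    """Invert Eq. (71) for alpha_s with the O(alpha_s) coefficient function."""
    return math.pi * (1.0 - 6.0 * gamma / a3)


def alpha_s_error(gamma_err: float, a3: float) -> float:
    """Linear propagation of the sum-rule error, neglecting the error on a3."""
    return 6.0 * math.pi * gamma_err / a3


if __name__ == "__main__":
    gamma, gamma_err = 0.16, 0.03  # Eq. (75), at Q^2 = 1 GeV^2
    a3 = 1.27                      # assumed rounded beta-decay value
    central = alpha_s_from_sum(gamma, a3)
    error = alpha_s_error(gamma_err, a3)
    print(f"alpha_s(1 GeV^2) = {central:.2f} +- {error:.2f}")
```

Under these assumptions the propagated uncertainty on $\alpha_s$ at 1 GeV$^2$ is about 0.4, of the same order as the central value itself, which is the sense in which such an extraction is not competitive.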

7 Conclusions and outlook 



We have presented a first determination of polarized parton distributions based on the NNPDF 
methodology: NNPDFpol1.0. We have determined polarized PDFs from the most recent inclusive data on 
proton, deuteron and neutron deep-inelastic polarized asymmetries and structure functions. Our main 
result is that the uncertainty in the gluon distribution, and to a lesser extent the strange distribution, 
and in the small x extrapolation for all parton distributions, is rather larger than in previous 
polarized PDF determinations. Also, there seems to be some tension between strangeness determined in 
deep-inelastic scattering and using semi-inclusive data. 

In particular, we find that the role of the gluon distribution in the spin structure of the nucleon is 
essentially unknown, as the first moment of the gluon distribution is compatible with zero, but with an 
uncertainty which is compatible with a very large positive or negative gluon spin fraction. Likewise, 
the contribution from the small x region to the Bjorken sum rule makes its use as a means to determine 
$\alpha_s$ essentially impossible. Different conclusions can be reached only if one is willing to make strong 
theoretical assumptions on the small x behaviour of polarized PDFs. 

Future experiments, in particular open charm and hadron production in fixed target experiments [78, 87], 
inclusive jet production [80, 81] and W boson production [82-84] from the RHIC collider may 
improve the knowledge on individual polarized flavors and antiflavors and on the gluon distribution in the 
valence region. However, only a high-energy electron-ion collider [32, 71] might provide information on 
polarized PDFs at small x and thus reduce the uncertainty on first moments in a significant way. 



The NNPDFpol1.0 polarized PDFs, with $N_{\rm rep} = 100$ replicas, are available from the NNPDF 
HepForge web site, 

http://nnpdf.hepforge.org/ . 

A Mathematica driver code is also available from the same source. 

Table 17: Percentage difference between FastKernel perturbative evolution of polarized PDFs and the Les 
Houches benchmark tables [85] for different polarized PDF combinations at NLO in the ZM-VFNS scheme. 

A Benchmarking of polarized PDF evolution 

We have benchmarked our implementation of the evolution of polarized parton densities by 
cross-checking against the Les Houches polarized PDF evolution benchmark tables [85]. Note that in Ref. [85] 
the polarized sea PDFs are given incorrectly, and should be 

$$x\Delta\bar{u} = -0.045\, x^{0.3} (1-x)^7 , \qquad x\Delta\bar{d} = -0.055\, x^{0.3} (1-x)^7 . \tag{76}$$

These tables were obtained from a comparison of the HOPPET [58] and PEGASUS [86] evolution codes, 
which are x-space and N-space codes respectively. In order to perform a meaningful comparison, 
we use the so-called iterated solution of the N-space evolution equations and use the same initial 
PDFs and running coupling as in [85]. The relative difference $\epsilon_{\rm rel}$ between our PDF evolution and the 
benchmark tables of Ref. [85] at NLO in the ZM-VFNS scheme is tabulated in Tab. 17 for various 
combinations of polarized PDFs: the accuracy of our code is $O(10^{-5})$ for all relevant values of x, which 
is the nominal accuracy of the agreement between HOPPET and PEGASUS. 
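
The comparison described above amounts to an entry-by-entry relative difference between two tables. A minimal sketch follows; the function names and the sample numbers are hypothetical (they are neither FastKernel output nor entries of the tables of Ref. [85]), and the use of an unsigned difference normalized to the benchmark value is an assumed definition of the relative difference.

```python
from typing import Sequence


def relative_difference(ours: float, benchmark: float) -> float:
    """Unsigned difference normalized to the benchmark value."""
    if benchmark == 0.0:
        raise ValueError("relative difference is undefined for a zero benchmark")
    return abs(ours - benchmark) / abs(benchmark)


def max_relative_difference(
    ours: Sequence[float], benchmark: Sequence[float]
) -> float:
    """Largest relative difference over a grid of x values."""
    if len(ours) != len(benchmark):
        raise ValueError("both tables must be given on the same grid")
    return max(relative_difference(o, b) for o, b in zip(ours, benchmark))


if __name__ == "__main__":
    # Made-up numbers standing in for one PDF combination on a grid in x.
    benchmark = [1.0, -0.5, 0.25]
    ours = [1.00001, -0.500004, 0.2500015]
    worst = max_relative_difference(ours, benchmark)
    print(f"maximum relative difference: {worst:.1e}")
```

Quoting the maximum over the grid, rather than an average, gives a conservative figure to compare with the nominal accuracy of the benchmark itself.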

Therefore, we can conclude that the accuracy of the polarized PDF evolution in the FastKernel 
framework is satisfactory for precision phenomenology. 